Problem Definition

A robot often must recognize and locate objects in its workspace, or more informally, must
use sensory information to determine what objects are where, in order to manipulate them.
Since  speed  of  operation  is  also  an  important  consideration  in  robotics  applications,  the 
interaction  of  sensing  and  action  should  take  place  using  a  minimal  amount  of  sensory 
data.  This  requires  methods  for  optimally  (or  near  optimally)  selecting  positions  at 
which  to  obtain  sensory  data.  Clearly,  the  notion  of  optimal  selection  of  new  data  points 
will  in  part  be  tied  to  the  specific  recognition  engine  used  to  interpret  those  data  points. 
In previous papers [Gaston and Lozano-Perez 84; Grimson and Lozano-Perez 84, 85a,
85b] we have presented a constraint based recognition and localization technique that
uses  as  input,  sparse,  noisy,  occluded  measurements  of  the  position  and  orientation  of 
small  patches  of  an  object's  surface  obtained  from  any  of  several  sensing  modalities. 
Applying  this  recognition  system  to  such  sensory  input  data  results  in  a  small  set  of 
object  poses,  that  is,  a  set  of  transformations  taking  a  known  object  model  from  an 
intrinsic  coordinate  system  into  a  coordinate  system  defined  relative  to  the  sensor.  In 
this  paper,  we  consider  the  problem  of  disambiguating  from  among  this  fixed  set  of  object 
poses.  Note  that  the  set  of  poses  could  include  poses  corresponding  to  different  objects. 
To  disambiguate  among  a  set  of  interpretations,  we  need  to  acquire  sensory  data  that  will 
clearly  distinguish  one  pose  of  an  object  from  another,  using  as  few  additional  sensory 
points  as  possible.  Thus,  our  problem  is  to  optimally  select  places  at  which  to  obtain 
the  needed  sensory  data. 

While we use the recognition system developed in [Grimson and Lozano-Perez 84,
85a, 85b] as the basis for investigating sensing strategies for disambiguation, we expect
that  some  of  the  results  of  this  investigation  should  have  application  in  more  general 
situations  of  recognition  and  localization.  To  illustrate  this,  we  begin  with  a  set  of 
examples  of  the  use  of  sensing  strategies. 


Example  I:  Disambiguating  Multiple  Interpretations 

Suppose  we  are  given  a  sparse  set  of  sensory  data  points,  each  recording  the  position 
and  orientation  of  a  small  patch  of  some  surface  in  the  workspace  of  a  robot.  Our  goal 
is  to  determine  what  objects,  from  a  set  of  known  objects,  are  consistent  with  this  data, 
together  with  the  pose  (position  and  orientation)  of  the  object  that  leads  to  such  a 
consistent  interpretation.  In  the  case  of  sensory  data  known  to  all  lie  on  one  object,  we 
take  consistent  to  mean  that  a  rigid  transformation  of  the  object  will  cause  all  of  the  data 
points  to  lie  on  the  object,  with  the  correct  surface  orientation  (to  within  some  known 
error  bounds).  In  the  case  of  sensory  data  that  may  come  from  more  than  one  object,  we 
take  consistent  to  mean  that  a  maximum  subset  of  the  data  satisfies  the  above  condition. 
In  this  case,  of  course,  other  interpretations  of  consistent  are  possible. 

In [Grimson and Lozano-Perez 84, 85a, 85b] we described an efficient constrained
search technique for matching the sensory data to faces of an object model, in order
to find the interpretations of the data. The sensory data consist of measurements of
the position and surface orientation of small patches of object surfaces. The objects
are modeled by sets of planar face equations. The technique uses efficient constraints
between data elements and model elements to determine the set of interpretations of the
data consistent with the model, that is, the set of poses of the object that agree with the
input data. Empirical testing, as well as theoretical analysis [Grimson 84], indicates that
in general, there will be only one consistent interpretation of the data. It is possible,
however, that more than one pose of the object will be consistent with the data, even for
non-symmetric objects and even if the object is known (see Figure 1). To determine the
correct pose, we will need additional sensing.


Figure 1. Example of multiple interpretations. Given only a sparse set of isolated data points, multiple
interpretations such as those indicated may be possible.

The  simplest  method  for  obtaining  the  supplementary  sensory  data  is  to  sample  the 
object  at  random.  If  the  sensing  process  is  fast  enough,  and  if,  on  average,  only  a  few 
additional  points  are  required  in  order  to  remove  the  ambiguity,  then  such  a  random 
sensing  strategy  could  suffice.  It  is  easy,  however,  to  find  situations  in  which  a  random 
sensing  strategy  would  be  ineffective  in  disambiguating  between  possible  interpretations 
and  in  general  one  expects  a  random  sensing  strategy  to  have  a  very  slow  convergence. 
Moreover,  some  sensing  modalities,  for  example,  tactile  sensing,  are  inherently  sparse, 
and  require  considerable  expense  to  obtain  additional  sensing  points.  In  this  case,  it  is 
particularly  desirable  to  perform  recognition  with  minimal  sensory  interaction. 

Table 1. Histograms of points needed for disambiguation. Each column indicates the number of sensory
points needed to force a unique interpretation, and the number indicated in that column is the number
of trials, out of 100, for which that number of data points was required.

For example, Table 1 lists histograms of the number of additional, randomly chosen,
sensing points needed to uniquely disambiguate several consistent interpretations. We
generated an initial set of 9 points of data, all lying on a single object (shown in Figure 1),
and determined the set of consistent interpretations of that data, using the system
described in [Grimson and Lozano-Perez 85a, 85b]. We then generated additional sense
points until only one interpretation remained consistent with the data. This process
was repeated for 100 trials, and the number of sense points needed to disambiguate the
interpretations was histogrammed. The results are recorded in Table 1, where each entry
is the number of trials terminating with the indicated number of sense points. The sensory
data was generated by randomly choosing approach directions towards the object,
and sensing for contact along them, much as might occur in tactile sensing. It can be
seen that choosing sensing directions at random may converge slowly towards
a unique interpretation, even though here we are only dealing with the simple
case of data from a single known object.


In general, one would expect a tradeoff between random sensing strategies and feature
driven sensing strategies. Given two possible interpretations of the data, consider
constructing  the  volume  difference,  consisting  of  all  points  contained  in  one  but  not  both 
of  the  interpretations.  If  the  size  of  this  volume  relative  to  the  volume  of  the  object  is 
large,  then  in  general,  one  would  expect  randomly  generated  additional  sensing  points  to 
quickly  disambiguate  the  situation.  On  the  other  hand,  if  the  relative  volume  is  small, 
one  would  expect  that  a  large  number  of  additional  sense  points  would  be  needed  before 
one  of  them  struck  this  volume  difference.  In  this  case,  a  more  directed  sensing  strategy 
is  likely  to  be  more  effective. 
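
The tradeoff can be made roughly quantitative under a simplifying assumption that is not part of the text: suppose each random sensing point independently lands in the volume difference with some fixed probability p (p grows with the relative volume but is not claimed to equal it). The number N of points needed before the first hit is then geometrically distributed:

```latex
P(N > k) = (1 - p)^{k}, \qquad E[N] = \frac{1}{p}.
% Illustrative values only: for p = 0.05, E[N] = 20 and
% P(N > 20) = 0.95^{20} \approx 0.36.
```

Under this assumption a small p leaves a substantial chance that many random points are spent without resolving the ambiguity, which is the situation in which a directed strategy is attractive.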


Example  II:  Localization  with  Minimal  Sensing 


In  the  previous  example,  we  discussed  the  problem  of  generating  additional  sensory  data, 
given  some  initial  set  of  data  and  the  interpretations  consistent  with  it.  A  related  problem 
is  to  consider  the  optimal  acquisition  of  all  of  the  sensory  data,  rather  than  just  that 
needed  to  disambiguate  interpretations.  For  example,  consider  a  situation  in  which  a 
known  object,  with  a  fixed  set  of  known  stable  positions  is  being  sensed.  This  might 
be  the  case,  for  example,  when  considering  objects  in  pallets,  or  feeders.  We  would  like 
to  determine  the  pose  of  the  object  with  as  few  sensory  points  as  possible.  Here,  the 
initial  set  of  interpretations  is  the  set  of  stable  configurations  of  the  object.  Given  this 
set  of  stable  configurations,  we  want  to  determine  the  optimal  sensing  directions  for 
distinguishing  that  set  of  configurations. 


Example  III:  Simple  Inspection 


The  problem  of  determining  sensing  positions  can  also  arise  in  simple  inspection  tasks. 
Suppose  we  are  given  an  object  pose,  and  a  set  of  distinctive  points  defined  on  the  object 
model.  In  this  case,  we  may  be  able  to  use  the  techniques  developed  below  to  choose 
the  sensing  rays  needed  to  test  that  the  designated  distinctive  model  points  are  in  fact 
present  in  the  sensed  object. 

Assumptions


Thus,  the  problem  to  be  addressed  in  this  paper  is  finding  effective  and  rigorous  sensing 
strategies  for  deciding  between  a  set  of  possible  poses  of  an  object,  or  multiple  objects. 
We  will  assume  that  the  following  are  given: 

•  Set  of  Interpretations  —  Some  initial  set  of  possible  interpretations  is  assumed  given. 
This  could  be  either  from  the  application  of  some  recognition  process  to  a  set  of  initial 
sensed  points,  or  from  assumptions  about  the  object  to  be  sensed,  in  particular 
that  it  is  lying  in  one  of  a  known  number  of  stable  positions.  In  each  case,  the 
interpretation  includes  a  computed  transformation  giving  the  pose  of  the  model  in 
sensor  coordinates. 

• Set of Sensing Directions — It is assumed that the initial sensory data were generated
by sampling along a set of known directions. For example, in the case of
visual  sensing  these  could  be  given  by  the  orientation  of  the  cameras  relative  to  the 
workspace.  In  general,  determining  optimal  sensing  rays  is  a  four  degree  of  freedom 
problem.  In  this  paper,  we  assume  that  the  two  rotational  degrees  of  freedom  are 
restricted  to  a  small  set  of  possibilities  by  the  sensing  geometry,  such  as  the  given 
camera  orientations.  We  then  optimize  over  the  remaining  two  degrees  of  freedom. 

•  Polyhedral  Object  Models  —  We  assume  that  the  objects  to  be  sensed  have  been 
modeled  as  polyhedra,  although  the  objects  themselves  need  not  be  polyhedral.  Any 
deviations  between  curved  objects  and  their  polyhedral  models  will  simply  contribute 
to  a  small  amount  of  error  in  the  sensory  data,  to  which  a  recognition  system  should 
be  insensitive. 

The  goal  is  to  disambiguate  between  the  set  of  interpretations  by  determining  positions  at 
which  to  obtain  subsequent  sensory  information.  These  positions  should  be  such  that  by 
sensing  along  one  of  the  possible  directions,  the  recorded  information  will  disambiguate 
between  the  set  of  possible  interpretations  (or  some  subset  of  the  interpretations)  in  the 
presence  of  possible  error  in  the  computed  transformations  associated  with  each  of  the 
interpretations. 

In the examples given above, we assumed that we had available techniques for acquiring
the sensory data, and techniques for solving the recognition and localization
problem. There are, of course, many techniques for obtaining information about the
three-dimensional positions of points on an object, as well as the local surface normals
at those points. Typical examples of such measurement processes include tactile sensing [e.g.
Harmon 82, Hillis 82, Overton and Williams 81, Purbrick 81, Raibert and Tanner 82,
Schneiter 82], binocular stereo [e.g. Baker and Binford 81, Barnard and Thompson 80,
Grimson 81, 85, Marr and Poggio 79, Mayhew and Frisby 81, Ohta and Kanade 85],
photometric stereo [e.g. Ikeuchi and Horn 79, Woodham 78, 80, 81], laser range-finding
[e.g. Lewis and Johnston 77, Nitzan, Brain, and Duda 77], and structured-light systems
[e.g. Popplestone, et al. 75, Shirai and Suwa 71]. These methods can provide information
about the three-dimensional positions of points on the object, as well as the local surface
normals at those points, usually with some error in the measurements.

A number of different techniques have been developed for model-based recognition
and localization. If one views recognition as a search for a consistent match between
data elements and model elements, then much of the variation between existing recognition
schemes can be accounted for by the choice of what descriptive tokens to match.
Examples of techniques relying on sparse distinctive features include the use of a few
extended features [Perkins 78, Ballard 81], the use of one feature as a focus, with the
search restricted to a few nearby features [Tsuji and Nakamura 75, Holland 76, Sugihara
79, Bolles and Cain 82, Bolles, Horaud and Hannah 83], matching of high level descriptions
[Nevatia 74, Nevatia and Binford 77, Marr and Nishihara 78, Brooks 81, Brady 82]
and the use of geometric relationships between simple descriptors [Horn 83, Horn and
Ikeuchi 83, Ikeuchi 83, Faugeras and Hebert 83, Gaston and Lozano-Perez 84, Grimson
and Lozano-Perez 84, Stockman and Esteva 84, Brou 84]. The basis for the present work
is the approach presented in [Gaston and Lozano-Perez 84, Grimson and Lozano-Perez
84, 85a, 85b].

For  the  purposes  of  this  paper,  we  will  assume  that  such  techniques  are  available. 
Our  concentration  is  on  the  problem  of  choosing  optimal  sensing  strategies  for  interacting 
with  such  techniques. 


An  Algorithm  For  Computing  Sensing  Directions 

To  demonstrate  the  approach  of  computing  sensing  directions,  we  first  look  at  an  example 
in  two  dimensions  (see  Figure  2),  where  the  object  has  three  degrees  of  positional  freedom 
(one  rotational  and  two  translational). 


Figure 2. Two dimensional example of multiple poses. Both poses are consistent with the sensory data,
indicated by the small surface normals and the points of contact.

After our recognition and localization process has been applied to a sparse set of data
points, we are left with some set of poses of the object consistent with that data. We are
given a set of sensing directions, that is, a set of unit vectors $\mathbf{s}_i$, indexed over $i \in I$, such
that sensing can occur along directions parallel to any of these unit vectors, for some set
of initial positions. For example, in Figure 3, if $\mathbf{o}$ is an offset vector, where $\mathbf{o} \cdot \mathbf{s}_i = 0$,
then we can sense along the ray $\mathbf{o} + \alpha \mathbf{s}_i$ as $\alpha$ varies. Equivalently, we can think of this
as having some finite portion of a plane perpendicular to $\mathbf{s}_i$, such that for any point on
the plane, we can sense along a ray through that point in the direction of $\mathbf{s}_i$.
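
The sensing geometry can be sketched in a few lines of Python. This is an illustration under assumptions, not code from the paper: the function names are hypothetical, vectors are plain tuples, and a model face is assumed to be given by a plane equation x . n = d in sensor coordinates.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def offset_for_point(p, s):
    """Offset o with o . s = 0 such that the ray o + alpha * s passes through p.
    Assumes s is a unit vector."""
    k = dot(p, s)
    return tuple(pi - k * si for pi, si in zip(p, s))

def ray_point(o, s, alpha):
    """Point o + alpha * s on the sensing ray."""
    return tuple(oi + alpha * si for oi, si in zip(o, s))

def ray_plane_alpha(o, s, n, d, tol=1e-12):
    """Value of alpha at which the ray o + alpha * s meets the plane x . n = d,
    or None when the ray is parallel to the plane."""
    denom = dot(s, n)
    if abs(denom) < tol:
        return None
    return (d - dot(o, n)) / denom
```

For instance, with s = (0, 0, 1) the point (1, 2, 5) yields the offset (1, 2, 0), and the ray through that offset meets the plane z = 5 at alpha = 5.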


Figure 3. Example of the sensing geometry. Each vector s defines a sensing direction. The actual
sensing ray is defined by specifying an offset vector o relative to the origin of the sensing plane through
which the ray must pass, parallel to s.

We are also given some bounds on the sensitivity of the sensing device in measuring
surface normals and surface positions. In particular, we define $\epsilon_n$ and $\epsilon_d$ in the following
manner, illustrated in Figure 4. If $\mathbf{n}_{true}$ is the surface normal at some point on an object,
measured in sensor coordinates, and $\mathbf{n}_{sense}$ the normal measured by the sensing device,
then

$$\mathbf{n}_{sense} \cdot \mathbf{n}_{true} \ge \epsilon_n .$$

Similarly, if $\mathbf{p}_{true}$ is the actual position of a point on an object, measured in sensor coordinates,
and $\mathbf{p}_{sense}$ is the position measured by the sensing device, then

$$|\mathbf{p}_{sense} - \mathbf{p}_{true}| \le \epsilon_d .$$

Thus, $\epsilon_n$ and $\epsilon_d$ describe the range of uncertainty in the measurements of normals and
distances, respectively.
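
As a small illustration of these bounds, the sketch below checks whether a measurement is compatible with a predicted normal and position. It assumes one reading of the bounds above: the normal bound is a lower limit on the dot product of unit normals, and the position bound is an upper limit on Euclidean distance. The function and parameter names (`eps_n`, `eps_d`) are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normal_within_bound(n_sense, n_true, eps_n):
    """True if the measured unit normal lies inside the error cone about the
    true normal, with the cone expressed as a dot-product lower bound (assumption)."""
    return dot(n_sense, n_true) >= eps_n

def position_within_bound(p_sense, p_true, eps_d):
    """True if the measured position lies inside the error ball of radius eps_d."""
    return math.dist(p_sense, p_true) <= eps_d
```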

The basic idea is that over the set of all given sensing directions $\{\mathbf{s}_i \mid i \in I\}$, we want
to find a particular direction $\mathbf{s}_{i_0}$ and an offset position $\mathbf{o}$, such that sensing along the ray
$\mathbf{o} + \alpha \mathbf{s}_{i_0}$ will distinguish the poses. By distinguish, we mean that for all pairs of possible
poses, either the difference in the expected normals of the faces that intersect the ray, or
the difference in the expected positions of the points of intersection of the ray with the
corresponding faces of the poses, is greater than the sensitivity of the sensing device.

We note that a sensing ray that fails to intersect one of the possible poses
is still acceptable. Indeed, in the case of two possible poses, sensing rays that would contact
only one of the poses are likely to be among the best candidates for disambiguating
the two poses. Secondly, we note that if there are many possible poses, it may not be
possible to find one sensing ray that will distinguish between all of them. Instead, we
may have to use a series of measurements to determine the correct pose. The number of
such measurements will be bounded above by the number of poses, however.

Figure 4. Error bounds. The true surface normal is known to lie within a specified cone of the measured
normal, while the true position is known to lie within a specified ball about the measured position.

The  main  problem  to  be  faced  in  finding  good  sensing  rays  is  the  existence  of  error  in 
the  computed  transformations  associated  with  each  pose.  Thus,  for  the  sensing  strategy 
to  be  effective,  the  ray  must  both  distinguish  the  poses,  and  be  insensitive  to  errors  in 
the  position  and  orientation  of  the  poses. 

The  proposed  method  is  quite  simple  and  is  illustrated  in  Figure  5,  in  which  two 
poses  of  the  object  are  shown,  one  in  solid  lines,  the  other  in  hashed  lines.  The  steps  of 
the  method  are  as  follows. 

1.  Pick  a  particular  sensing  direction  s  (we  will  assume  the  convention  that  s  points 
from  the  sensor  towards  the  object).  In  the  two-dimensional  case,  we  can  define  a 
line  perpendicular  to  the  sensing  direction,  which  we  will  call  the  sensing  line,  with 
origin  at  the  point  on  the  line  closest  to  the  origin  of  the  sensor  space.  In  three 
dimensions,  this  would  be  a  sensing  plane.  This  is  shown  in  Figure  5a. 

2.  We  fix  the  position  of  this  line  at  some  arbitrary  reference  point,  for  example  by 
specifying  the  minimum  distance  of  the  line  from  the  origin  of  the  space  to  be  d. 
This  is  shown  in  Figure  5b. 

3. Now consider one of the poses, for example, the one shown in hashed lines in the
figure. For each face $f_i$ in the model, with corresponding model unit normal $\mathbf{n}_{m,i}$, we
let ${}^{s}\mathbf{n}_{m,i}$ denote the unit normal rotated into sensor coordinates, i.e. corresponding
to the orientation of the face relative to the pose of the object. If the face points
towards the sensor ($\mathbf{s} \cdot {}^{s}\mathbf{n}_{m,i} < 0$), we project the boundaries of the face onto the
sensing line, as shown for example in Figure 5c. In other words, each end point
$\mathbf{e}$ of the edge is projected to a point on the sensing line, $\mathbf{e} + (d - \mathbf{e} \cdot \mathbf{s})\mathbf{s}$. In three
dimensions, this would entail the projection of the edges of a face onto the sensing
plane.


4. We can label the resulting segment of the s-line with the surface normal ${}^{s}\mathbf{n}_{m,i}$ and
with the range of distances from the object face to the s-line. That is, if $\mathbf{v}$ is a point
on the edge, in sensor coordinates, then $\mathbf{v} \cdot \mathbf{s} - d$ is the distance from the point to
the s-line.

Figure 5. Projection of Poses onto the Sensing Line. In part (a) the sensing line is indicated, orthogonal
to the sensing direction. In part (b) this line is fixed at a distance d from the origin. In part (c) the
visible faces of one of the poses are projected onto the sensing line defined by the sensing ray. This
projection defines a partition of the sensing line. Here S is the sensing ray and d is the distance to the
origin. In part (d), the visible faces of the second pose are also projected onto the sensing line. In part
(e), the respective partitions are tested for distinguishability, based on differences in expected surface
orientation and differences in expected position, and the distinguishable regions of the sensing line are
marked. Using a sensing ray through the midpoint of either of the two marked regions would enable one
to disambiguate the two poses, as shown in part (f), in which the expected sensory points are indicated.

5. When all the visible faces of a pose have been projected onto the s-line, we can
perform hidden surface removal, to reduce the set of possibly overlapping segments
of the s-line to a set of disjoint segments. Each segment will be labeled by the
surface orientation of the corresponding face, in sensor coordinates, and the range
of distances to points on the face.

6.  We  can  perform  this  operation  for  each  pose,  obtaining  a  different  disjoint  partition 
of  the  s-line,  labeled  by  the  appropriate  surface  normals  and  distance  ranges,  as 
shown  in  Figure  5d  (slightly  offset  for  graphical  clarity). 

7.  Next,  we  intersect  the  set  of  all  such  partitions.  That  is,  we  define  a  new  partition  of 
the  s-line  with  two  properties.  First,  each  segment  of  this  new  partition  lies  within 
exactly  one  segment  of  each  of  the  partitions  of  the  s-line  corresponding  to  a  pose. 
Second,  this  new  partition  is  the  smallest  (in  terms  of  number  of  segments)  such 
partition.  The  label  associated  with  each  segment  of  the  new  partition  is  the  union 
of  the  labels  of  the  corresponding  segments  of  the  individual  partitions. 

8. This partition can now be analyzed for distinguishability. More precisely, given a
segment of the partition, the set of normals

$$\{ \mathbf{n}_i \mid i \in J \}$$

associated with that segment is distinguishable if

$$\max_{i, j \in J,\ i \neq j} (\mathbf{n}_i \cdot \mathbf{n}_j) < 2\epsilon_n .$$

In other words, given a measurement of the actual object in this region, we can
uniquely determine to which pose it corresponds. Similarly, the set of distance
measurements

$$\{ (\alpha_{\min,i}, \alpha_{\max,i}) \mid i \in J \}$$

is distinguishable if

$$\max_{i, j \in J,\ i \neq j} \left( |\alpha_{\min,i} - \alpha_{\max,j}| \right) > 2\epsilon_d .$$

We can collect all such distinguishable segments of the partition, thereby determining
the set of possible sensing points along the particular choice of s. This is illustrated
in Figure 5(e).

If  there  were  no  error  in  the  transformations  associated  with  the  poses,  we  would  be 
done,  since  any  point  in  this  set  would  disambiguate  the  poses,  (see  Figure  5(f)  for  an 
example).  To  account  for  possible  error  in  the  transformations  associated  with  the  poses, 
however,  we  need  to  be  somewhat  judicious  in  our  choice  of  sensing  point.  The  basic  idea 
is  to  choose  a  point  such  that  the  face  with  which  contact  is  made  remains  the  same  over 
small  perturbations  in  the  transformation.  In  two  dimensions  this  is  most  easily  done 
by  choosing  the  midpoint  of  the  longest  segment.  In  three  dimensions,  the  easiest  way 
to  choose  such  a  point,  from  among  the  set  of  distinguishable  polygons  on  the  sensing 
plane,  is  by  applying  the  notion  of  a  Chebychev  point,  defined  as  follows.  Suppose  we 
are given a polygon on a plane, each of whose edges is defined by a pair $(\mathbf{n}_j, d_j)$, where
$\mathbf{n}_j$ is a unit normal lying in the plane, and $d_j$ is a constant such that points along the
edge are defined by

$$\{ \mathbf{v} \mid \mathbf{v} \cdot \mathbf{n}_j - d_j = 0 \} .$$

Then the distance from any point $\mathbf{v}$ to an edge is given by

$$\mathbf{v} \cdot \mathbf{n}_j - d_j .$$

The Chebychev point of a polygon is the point which maximizes the minimum distance
from the point to any edge of the polygon, that is, the point $\mathbf{v}$ that satisfies

$$\min_j (\mathbf{v} \cdot \mathbf{n}_j - d_j) \ge \min_j (\mathbf{u} \cdot \mathbf{n}_j - d_j) \quad \forall \mathbf{u}$$


where  the  value  taken  by  this  expression  at  the  Chebychev  point  is  called  the  Chebychev 
value  of  the  polygon.  Clearly,  the  polygon  with  the  maximum  Chebychev  value  will 
be  the  least  sensitive  to  perturbations  in  the  computed  transformations,  and  thus  the 
Chebychev  point  with  the  maximum  Chebychev  value,  as  measured  over  the  set  of  all 
distinguishable  polygons,  defines  the  best  sensing  position.  Note  that  we  can  improve 
the  reliability  of  the  sensing  strategy  even  further  by  choosing  the  maximum  Chebychev 
point  as  measured  over  connected  sequences  of  distinguishable  segments. 

9. We repeat this process over all sensing directions $\mathbf{s}_i$, choosing the direction that best
distinguishes the feasible poses.
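
The Chebychev point used in step 8 can also be written as a small linear program. This is a standard reformulation rather than something stated in the text. It assumes the polygon is convex and that each normal n_j is oriented so that v . n_j - d_j is non-negative for points v inside the polygon:

```latex
\max_{\mathbf{v},\, r} \; r
\quad \text{subject to} \quad
\mathbf{v} \cdot \mathbf{n}_j - d_j \ge r \quad \text{for all edges } j .
```

At the optimum, v is the Chebychev point and r is the Chebychev value, i.e. the radius of the largest disc about v that stays inside the polygon.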

While  this  analysis  has  been  done  in  two  dimensions,  it  clearly  extends  to  the  general 
three  dimensional  case.  Here,  the  visible  faces  are  projected  into  polygons  on  a  sensing 
plane,  and  the  intersection  of  the  projections  over  all  poses  gives  a  partition  of  this  plane, 
which  can  be  tested  for  distinguishability. 
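
The two-dimensional procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each pose is a list of faces `(p0, p1, n)` already expressed in sensor coordinates, the polygons do not self-intersect, normals are treated as distinguishable when their dot product falls below a caller-supplied threshold `cos_thresh`, and a segment on which only one of two poses is hit counts as distinguishable. All names are hypothetical.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def depth_at(face, s, u, d, t):
    """Distance along s from the sensing line to the face, at line coordinate t."""
    p0, p1, _ = face
    t0, t1 = dot(p0, u), dot(p1, u)
    w = (t - t0) / (t1 - t0)
    return dot(p0, s) + w * (dot(p1, s) - dot(p0, s)) - d

def label(faces, s, u, d, a, b):
    """Label of segment [a, b] for one pose: (normal, min depth, max depth) of the
    nearest visible face, or None if rays through the segment miss the pose."""
    mid, best = 0.5 * (a + b), None
    for face in faces:
        p0, p1, n = face
        if dot(s, n) >= 0:                    # face does not point towards the sensor
            continue
        t0, t1 = dot(p0, u), dot(p1, u)
        if not min(t0, t1) < mid < max(t0, t1):
            continue
        depth = depth_at(face, s, u, d, mid)
        if best is None or depth < best[0]:   # nearest face wins (hidden surface removal)
            best = (depth, face)
    if best is None:
        return None
    da, db = depth_at(best[1], s, u, d, a), depth_at(best[1], s, u, d, b)
    return best[1][2], min(da, db), max(da, db)

def distinguishable(l1, l2, cos_thresh, eps_d):
    if l1 is None or l2 is None:
        return l1 is not l2                   # exactly one pose is hit
    if dot(l1[0], l2[0]) < cos_thresh:        # normals differ enough
        return True
    gap = max(l1[1] - l2[2], l2[1] - l1[2])   # separation of the depth ranges
    return gap > 2 * eps_d

def best_sensing_point(poses, s, d, cos_thresh, eps_d):
    """Midpoint and length of the longest distinguishable segment, or None."""
    u = (-s[1], s[0])                         # direction along the sensing line
    breaks = sorted({dot(p, u) for faces in poses for f in faces for p in f[:2]})
    best = None
    for a, b in zip(breaks, breaks[1:]):      # refinement of all pose partitions
        labels = [label(faces, s, u, d, a, b) for faces in poses]
        ok = all(distinguishable(labels[i], labels[j], cos_thresh, eps_d)
                 for i in range(len(labels)) for j in range(i + 1, len(labels)))
        if ok and (best is None or b - a > best[1]):
            best = (0.5 * (a + b), b - a)
    return best
```

The sketch treats each elementary segment separately. Merging connected sequences of distinguishable segments before taking the midpoint, as the text suggests, would be a natural refinement.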


An  Implementation  of  the  Technique 

In  testing  the  proposed  algorithm,  we  have  chosen  a  slightly  modified  implementation  of 
the  technique,  that  avoids  some  of  the  difficulties  of  performing  hidden  surface  removal, 
and  of  intersecting  polygonal  partitions  of  a  plane.  One  means  of  circumventing  these 
difficulties is to use a regular grid tessellation of the plane.

In  particular,  suppose  that  we  partition  the  s-plane  with  a  rectangular  grid  whose 
elements  have  sides  of  length  h.  Rather  than  trying  to  compute  polygonal  regions  on  the 
s-plane  that  are  distinguishable,  we  shall  examine  each  grid  segment  within  the  bounds 
of  the  projected  object,  seeking  those  segments  that  are  themselves  distinguishable,  and 
then  we  will  piece  these  grid  elements  back  together. 

The  steps  of  the  new  algorithm,  many  of  which  are  identical  to  those  of  the  previous 
solution,  are  sketched  below. 

•  Initially,  mark  all  grid  segments  as  active. 

• Given a pose, and a sense direction $\mathbf{s}$, test each face for visibility. If the normal of
the face, in sensor coordinates, is given by ${}^{s}\mathbf{n}_{m,i}$, then a face is visible if $\mathbf{s} \cdot {}^{s}\mathbf{n}_{m,i} < 0$.

•  For  each  visible  face,  project  its  vertices  onto  the  s-plane,  resulting  in  a  set  of  new 
vertices  that  define  a  polygon  on  the  plane. 

• Given this polygon on the sensing plane, compute the smallest bounding rectangle
composed of an integral number of grid elements which encloses the inscribed
polygon. This rectangle has no intrinsic merit, but is simply a convenient means of
restricting the search process.

• For each grid element lying in this enclosing rectangle, apply the following test. If
the grid segment lies entirely outside of the polygon, nothing is done. If some edge of
the polygon passes through the segment, this segment is marked as inactive. If the
grid segment is still active and lies entirely within the polygon, a label is attached
to the grid segment. This label is composed of two elements. The first is the normal
of the face whose projection resulted in the current polygon on the sensing plane,
measured in sensor coordinates. The second is the range of possible positions that
could be achieved by intersecting a sensing ray passing through a point in this grid
segment with the face of the underlying interpretation. If the vector $\mathbf{o}$, lying in the
s-plane, defines the midpoint of the grid segment, this range is given by

$$\alpha_0 \pm \frac{\sqrt{2}\, h}{2} \tan\theta$$

where $\alpha_0$ is the value of $\alpha$ for which the ray $\mathbf{o} + \alpha\mathbf{s}$ intersects the face of the pose, $h$
is the size of the grid segment, $\theta$ is given by $\cos\theta = {}^{s}\mathbf{n} \cdot \mathbf{s}$, and ${}^{s}\mathbf{n}$ is the normal of
the face in sensor coordinates.

•  Repeat  this  process  for  all  visible  faces.  This  results  in  a  set  of  active  grid  segments, 
each  of  which  is  labeled  by  possibly  several  labels  of  the  type  described  above. 
This  set  of  labeled  active  grid  segments  represents  the  equivalent  of  the  partition 
of  the  sensing  plane  described  in  the  ideal  solution.  Note  that  we  have  avoided  the 
hidden  surface  problem  by  incorporating  multiple  labels  for  a  grid  segment,  from  a 
single  pose.  This  may  reduce  the  number  of  distinguishable  segments,  by  applying 
additional  constraints  on  the  criteria  of  distinguishability,  but  it  also  greatly  reduces 
the  computational  expense  of  the  process. 

• Once a partition of the grid is obtained for each pose, test the grid segments for
distinguishability. First, only grid segments that are active in all poses are considered.
Such  a  segment  is  considered  distinguishable  if  for  all  pairs  of  sets  of  labels,  either 
all  the  face  normals  of  one  label  are  distinguishable  from  all  the  face  normals  of  the 
other  (in  the  sense  defined  in  the  previous  section),  or  all  the  distance  ranges  of  one 
label  are  distinguishable  from  all  the  distance  ranges  of  the  other  (also  in  the  sense 
defined  in  the  previous  section). 

•  Finally, collect the set of distinguishable grid segments into convex connected
components.

•  Compute  the  best  sensing  position  as  the  center  of  the  largest  square  (with  sides 
an  integral  number  of  grid  segments)  that  can  be  placed  entirely  within  the  set 
of  distinguishable  grid  segments.  Note  that  if  the  square  has  sides  of  size  s  then 
the  Chebychev  value  for  the  segment  is  at  least  s/2.  This  process  can  be  repeated 
over  all  sensing  directions,  and  the  midpoint  of  the  largest  such  connected  convex 
collection  of  distinguishable  grid  segments  can  be  used  to  define  the  best  sensing 
position.  To  save  on  computation,  it  is  also  possible  to  define  a  minimum  size  for 
an  acceptable  convex  connected  component,  and  to  only  apply  this  process  until  the 
first  such  acceptable  component  is  obtained. 
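The last step of the procedure above, finding the largest square of distinguishable grid segments, can be sketched with a standard dynamic program. This is a minimal illustration, not the implementation used for the experiments: the boolean grid representation, the function name `largest_square`, and the convention that cell k spans the interval [k, k+1) are all assumptions made here.

```python
from typing import List, Optional, Tuple


def largest_square(grid: List[List[bool]]) -> Optional[Tuple[int, float, float]]:
    """Return (side, center_row, center_col) of the largest square made only of
    distinguishable (True) grid segments, or None if there is no such segment.
    The side is measured in grid segments; the center is in grid coordinates,
    where cell k is assumed to span the interval [k, k + 1)."""
    if not grid or not grid[0]:
        return None
    rows, cols = len(grid), len(grid[0])
    # side[i][j] = side of the largest all-True square whose lower-right
    # corner is cell (i, j).
    side = [[0] * cols for _ in range(rows)]
    best = (0, 0, 0)  # (side, row, col) of the best lower-right corner so far
    for i in range(rows):
        for j in range(cols):
            if not grid[i][j]:
                continue
            if i == 0 or j == 0:
                side[i][j] = 1
            else:
                side[i][j] = 1 + min(side[i - 1][j], side[i][j - 1], side[i - 1][j - 1])
            if side[i][j] > best[0]:
                best = (side[i][j], i, j)
    s, i, j = best
    if s == 0:
        return None
    return s, i + 1 - s / 2.0, j + 1 - s / 2.0
```

A square of side s found this way gives a Chebychev value of at least s/2 grid segments at the returned center, which is the quantity the text uses to rank sensing positions.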

In  Figure  6,  we  illustrate  the  above  technique  on  the  multiple  poses  of  Figure  1.  Note 
that  each  of  the  small  circles  denotes  a  point  on  the  grid  of  the  sensing  plane  that  is 
distinguishable.  We  can  then  determine  the  best  sensing  position  by  finding  the  largest 
square  area  filled  by  such  distinguishable  points.  The  figure  illustrates  the  computation 
for each of three different sensing rays.

Bounds on Transform Errors


In  order  to  use  such  an  algorithm,  we  need  to  determine  values  for  two  parameters.  First, 
errors  in  the  computed  transformation  associated  with  a  pose  will  affect  the  threshold 
needed  to  determine  distinguishability.  For  example,  if  there  is  no  error  in  the  computed 
transformation,  then  two  surface  normals  are  distinguishable  if  the  angle  between  them 
exceeds  the  range  of  error  in  measuring  such  normals.  When  error  is  present  in  the 
transformation, its effect on the expected surface normals must be added to this threshold,
thereby reducing the set of distinguishable normals. Second, we need a bound on
the  minimum  Chebychev  value  (or  its  approximation)  such  that  errors  in  the  computed 
transformation  will  not  affect  the  expected  values  of  the  sensor  along  the  sensing  ray. 
In  order  to  deal  with  these  parameters,  in  this  section  we  will  derive  theoretical  bounds 
on  the  possible  errors  in  the  computed  transformations.  In  doing  so,  we  will  also  derive 
criteria that can be imposed on the computation of the transformation from model
coordinates to sensor coordinates in order to reduce the range of possible error. Depending
on  the  sensor  data  available,  it  may  not  always  be  possible  to  satisfy  these  criteria,  in 
which  case  higher  possible  errors  will  have  to  be  tolerated. 


Computing  the  Transform 

There are many different methods for determining the transformation from model
coordinates to sensor coordinates, and the errors associated with that computation will
clearly be dependent on the specific method. To illustrate the disambiguation technique
developed here, we choose one particular scheme, and derive specific error bounds on
the model transformation for that scheme. This will then allow us to actually test our
disambiguation algorithm. We begin by reviewing the process used in [Grimson and
Lozano-Perez 84] for computing the transformation from model coordinates to sensor
coordinates.

We are given a set of possible poses of the sensed data, each one consisting of a set of
triples $(\mathbf{p}_i, \hat{n}_i, f_i)$, where $\mathbf{p}_i$ is the vector representing the sensed position, $\hat{n}_i$ is the vector
representing the sensed normal, and $f_i$ is the face assigned to this sensed data for that
particular pose. We want to determine the actual transformation from model coordinates
to sensor coordinates, corresponding to the pose [see also Grimson and Lozano-Perez 84].

We  assume  that  a  vector  in  the  model  coordinate  system  is  transformed  into  a  vector 
in  the  sensor  coordinate  system  by  the  following  transformation: 

$$\mathbf{v}_s = R\,\mathbf{v}_m + \mathbf{v}_0$$

where $R$ is a rotation matrix, and $\mathbf{v}_0$ is some translation vector. We need to solve for $R$
and $\mathbf{v}_0$.

Rotation  Component 

Suppose $\hat{n}_{m,i}$ is the unit normal, in model coordinates, of face $f_i$, and $\hat{n}_{s,i}$ is the
corresponding unit normal in sensor coordinates. Given two such pairs of model and sensor
normals, an estimate of the direction of rotation $\hat{r}_{ij}$ such that a rotation about that
direction would take $\hat{n}_{m,i}$ into $\hat{n}_{s,i}$ is given by the unit vector in the direction of

$$(\hat{n}_{m,i} - \hat{n}_{s,i}) \times (\hat{n}_{m,j} - \hat{n}_{s,j}).$$


If there were no error in the sensed normals, we would be done. With error included
in the measurements, however, the computed rotation direction $\hat{r}$ could be slightly wrong.
One way to reduce the effect of this error is to compute all possible $\hat{r}_{ij}$ as $i$ and $j$ vary
over the faces of the pose, and then cluster these computed directions to determine a
value for the direction of rotation $\hat{r}$.


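Once we have computed a direction of rotation $\hat{r}$, we need to determine the angle $\theta$
of rotation about it. This is given by

$$\cos\theta = 1 - \frac{1 - \hat{n}_{s,i}\cdot\hat{n}_{m,i}}{1 - (\hat{r}\cdot\hat{n}_{s,i})(\hat{r}\cdot\hat{n}_{m,i})}, \qquad \sin\theta = \frac{(\hat{r}\times\hat{n}_{m,i})\cdot\hat{n}_{s,i}}{1 - (\hat{r}\cdot\hat{n}_{s,i})(\hat{r}\cdot\hat{n}_{m,i})}. \tag{1}$$

Hence, given $\hat{r}$, we can solve for $\theta$. Note that if $\sin\theta$ is zero, there is a singularity in
determining $\theta$, which could be either $0$ or $\pi$. In this case, however, $\hat{r}$ lies in the plane
spanned by $\hat{n}_{s,i}$ and $\hat{n}_{m,i}$ and hence, only the $\theta = \pi$ solution is valid.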


As before, in the presence of error, we may want to cluster the $\hat{r}$ vectors, and then
take the average of the computed values of $\theta$ over this cluster.


Finally, given values for both $\hat{r}$ and $\theta$, we can determine the rotation matrix $R$. Let
$r_x, r_y, r_z$ denote the components of $\hat{r}$. Then

$$R = \cos\theta\, I + (1 - \cos\theta)\begin{pmatrix} r_x^2 & r_x r_y & r_x r_z \\ r_x r_y & r_y^2 & r_y r_z \\ r_x r_z & r_y r_z & r_z^2 \end{pmatrix} + \sin\theta\begin{pmatrix} 0 & -r_z & r_y \\ r_z & 0 & -r_x \\ -r_y & r_x & 0 \end{pmatrix}.$$

Note that in computing the rotation component of the transformation, we have
ignored the ambiguity inherent in the computation. That is, there are two solutions to
the problem, $(\hat{r}, \theta)$ and $(-\hat{r}, -\theta)$. We assume that a simple convention concerning the
sign of the rotation is used to choose one of the two solutions.
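The rotation computation can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the helper names are invented here, the clustering and averaging steps are omitted, and the sign of $\sin\theta$ is chosen so that the recovered rotation takes model normals into sensor normals, in agreement with the expansion of $R(\hat{r},\theta)\mathbf{v}$ used in the error analysis.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a: Vec) -> Vec:
    length = math.sqrt(dot(a, a))  # a zero length signals a degenerate pair of normals
    return (a[0] / length, a[1] / length, a[2] / length)


def rotation_axis(nm_i: Vec, ns_i: Vec, nm_j: Vec, ns_j: Vec) -> Vec:
    """Unit vector along (nm_i - ns_i) x (nm_j - ns_j)."""
    return unit(cross(sub(nm_i, ns_i), sub(nm_j, ns_j)))


def rotation_angle(r: Vec, nm: Vec, ns: Vec) -> float:
    """Angle about r taking the model normal nm into the sensor normal ns."""
    denom = 1.0 - dot(r, ns) * dot(r, nm)  # vanishes when the normal is parallel to r
    cos_t = 1.0 - (1.0 - dot(ns, nm)) / denom
    sin_t = dot(cross(r, nm), ns) / denom
    return math.atan2(sin_t, cos_t)


def rotate(r: Vec, theta: float, v: Vec) -> Vec:
    """Apply the rotation (r, theta) to v without forming the matrix explicitly."""
    c, s = math.cos(theta), math.sin(theta)
    k = (1.0 - c) * dot(r, v)
    rxv = cross(r, v)
    return (c * v[0] + k * r[0] + s * rxv[0],
            c * v[1] + k * r[1] + s * rxv[1],
            c * v[2] + k * r[2] + s * rxv[2])
```

Whichever of the two axis signs `rotation_axis` returns, `rotation_angle` returns the matching sign of the angle, so the pair always describes the same rotation.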


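Translation Component

Next, we need to solve for the translation component of the transformation. Suppose
we consider three triplets from the pose, $(\mathbf{p}_{s,i}, \hat{n}_{s,i}, f_i)$, $(\mathbf{p}_{s,j}, \hat{n}_{s,j}, f_j)$, and $(\mathbf{p}_{s,k}, \hat{n}_{s,k}, f_k)$,
such that the triple product $\hat{n}_{m,i} \cdot (\hat{n}_{m,j} \times \hat{n}_{m,k})$ is non-zero (i.e., the three face
normals are independent). Then, it can be shown that the translation component of the
transformation, $\mathbf{v}_0$, is given by

$$[\hat{n}_{m,i} \cdot (\hat{n}_{m,j} \times \hat{n}_{m,k})]\,\mathbf{v}_0 = (\hat{n}_{s,i} \cdot \mathbf{p}_{s,i} - d_i)(\hat{n}_{s,j} \times \hat{n}_{s,k}) + (\hat{n}_{s,j} \cdot \mathbf{p}_{s,j} - d_j)(\hat{n}_{s,k} \times \hat{n}_{s,i}) + (\hat{n}_{s,k} \cdot \mathbf{p}_{s,k} - d_k)(\hat{n}_{s,i} \times \hat{n}_{s,j})$$

where $d_i$ is the constant offset of face $f_i$ in model coordinates.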
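As a concrete check of the expression above, the following Python sketch solves for the translation from three face assignments. The function name `translation` and the tuple layout (model normal, sensor normal, sensed point, face offset) are assumptions made for illustration, and no averaging over trios is attempted.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]
Face = Tuple[Vec, Vec, Vec, float]  # (model normal, sensor normal, sensed point, offset d)


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def translation(faces: Sequence[Face]) -> Vec:
    """Solve for v0 from three (n_m, n_s, p_s, d) tuples with independent normals."""
    (nm_i, ns_i, p_i, d_i), (nm_j, ns_j, p_j, d_j), (nm_k, ns_k, p_k, d_k) = faces
    triple = dot(nm_i, cross(nm_j, nm_k))
    if abs(triple) < 1e-9:
        raise ValueError("face normals are not independent")
    # One (scalar, vector) pair per term on the right-hand side.
    terms = [
        (dot(ns_i, p_i) - d_i, cross(ns_j, ns_k)),
        (dot(ns_j, p_j) - d_j, cross(ns_k, ns_i)),
        (dot(ns_k, p_k) - d_k, cross(ns_i, ns_j)),
    ]
    return (sum(c * v[0] for c, v in terms) / triple,
            sum(c * v[1] for c, v in terms) / triple,
            sum(c * v[2] for c, v in terms) / triple)
```

With error-free data the three terms reproduce the translation exactly; with noisy data each trio gives a slightly different estimate.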

As  in  the  case  of  rotation,  if  there  is  no  error  in  the  measurements,  then  we  are  done. 
The  simplest  means  of  attempting  to  reduce  the  effects  of  error  on  the  computation  is  to 
average $\mathbf{v}_0$ over all possible trios of triplets from the pose.

Errors in the computed transformation

We  now  consider  possible  errors  in  each  of  the  parameters  of  the  transformation,  as  a 
function of error in the sensor measurements. The results are summarized below; more
explicit details may be found in the appendix.

Errors in $\hat{r}$

Let $\hat{n}_{m,i}$ be the unit normal of face $f_i$, in model coordinates, let $\hat{n}'_{m,i}$ be the associated
unit normal transformed into sensor coordinates, and let $\hat{n}_{s,i}$ be the actual measured
unit normal, in sensor coordinates. Suppose that the sensitivity of the measuring device
was $\epsilon_n$, that is,

$$\hat{n}_{s,i} \cdot \hat{n}'_{m,i} \ge \epsilon_n.$$

Then an absolute bound on the possible error in the computed value for the direction of
rotation, $\hat{r}_c$, in relation to the true direction of rotation, $\hat{r}_t$, is given by

SjSj  ^l  -  rf2 

1  - 

where 

=  f  ~  ^ 

y  |nm,t  -  J 

$$\gamma_i = \hat{n}_{m,i} \cdot \hat{n}'_{m,i}$$

Note that if $\gamma_i$ is close to $\epsilon_n$, then the error bound becomes increasingly large. This
is to be expected, since in this case $\hat{n}_{m,i} \approx \hat{n}'_{m,i}$, and thus small errors in the position
of $\hat{n}$ can lead to large errors in the position of $\hat{r}$. Similarly, if $\eta$ is near 1, large errors
can also result. If we restrict our computation (where possible) to cases where $\gamma_i$ and $\eta$
are small, then we have an approximate bound on the error in computing the direction
of rotation given by

$$\hat{r}_t \cdot \hat{r}_c \ge \epsilon_n.$$

This bound is supported by the results of the simulations reported in [Grimson and
Lozano-Perez 84].


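Errors in $\theta$

We know that the angle of rotation $\theta$ is given by

$$\tan\theta = \frac{(\hat{r} \times \hat{n}_m) \cdot \hat{n}'_m}{(\hat{r} \times \hat{n}'_m) \cdot (\hat{r} \times \hat{n}_m)}$$

where $\hat{n}_m$ is the unit normal of a face in model coordinates, $\hat{n}'_m$ is the corresponding
normal transformed into sensor coordinates, and $\hat{r}$ is the direction of rotation.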

If we let $\hat{r}_t$ denote the true direction of rotation, $\hat{r}_c$ denote the computed direction
of rotation, and $\hat{n}_s$ denote the measured surface normal corresponding to $\hat{n}_m$, then the
constraints on the error in computing the angle $\theta$ are that $\hat{r}_t \cdot \hat{r}_c \ge \cos\psi$ and
$\epsilon_n = \cos\phi$. In the appendix we show that the correct value for $\theta$ is given by

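$$\tan\theta_t = \frac{1}{\sin\nu}\,\frac{\cos\sigma}{\cos\rho}$$

where

$$\cos\nu = \hat{n}_m \cdot \hat{r}_t$$

$$\cos\rho = \left(\frac{\hat{r}_t \times \hat{n}_m}{|\hat{r}_t \times \hat{n}_m|}\right) \cdot \left(\frac{\hat{r}_t \times \hat{n}'_m}{|\hat{r}_t \times \hat{n}'_m|}\right)$$

$$\cos\sigma = \hat{n}'_m \cdot \left(\frac{\hat{r}_t \times \hat{n}_m}{|\hat{r}_t \times \hat{n}_m|}\right)$$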

Furthermore, we show in the appendix that the worst case for the computed value of $\theta$
is given by

sin  uf  cos  (p  +  \ip  +  7j) 


where 


cos  a;  =  co8(^  +  V*)  -  (1  -  cos  i^)  cos  ^  cos  V* 
cos  $  =  cos  V'l 


cos  7  =  cos  <P  cos  Ip 


cos*  w 


We could use these expressions to derive bounds on the possible variation in $\theta$ as a
function of $\phi$ and $\psi$, but this is a rather messy task. Instead, we show in the appendix
that if $\phi$ and $\psi$ are small, then an estimate for $\Delta\theta$ such that

$$\tan(\theta_t + \Delta\theta) = \tan\theta_c$$

is given by

$$|\Delta\theta| \approx |\phi + \psi|.$$

This bound is supported by the results of simulations reported in [Grimson and
Lozano-Perez 84].


Errors in $R\mathbf{v}$

We have computed expressions for the possible error in $\hat{r}$ and $\theta$. In particular, we will
denote the error in $\theta$ by $\Delta\theta$ and the vector error in $\hat{r}$ by $\delta\mathbf{r}$ such that $\hat{r} \cdot \delta\mathbf{r} = 0$. We now
consider the problem of estimating bounds on the possible error in applying the computed
rotation matrix to an arbitrary vector $\mathbf{v}$. We know that the rotational component of the
transformation of $\mathbf{v}$ is given by

$$R(\hat{r}, \theta)\mathbf{v} = \cos\theta\,\mathbf{v} + (1 - \cos\theta)(\hat{r} \cdot \mathbf{v})\hat{r} + \sin\theta\,(\hat{r} \times \mathbf{v})$$

where $\hat{r}$ and $\theta$ are the parameters determining the rotation.

We show in the appendix that if we ignore higher order terms, a Taylor series
expansion yields the following bound on errors in the computed value of a rotation:

$$|R(\hat{r} + \delta\mathbf{r}, \theta + \Delta\theta)\mathbf{v} - R(\hat{r}, \theta)\mathbf{v}| \le (2|\delta\mathbf{r}| + |\Delta\theta|)\,|\mathbf{v}|.$$

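Now, if the errors $\phi$ and $\psi$ are small, then we know that

$$|\Delta\theta| \le |\phi + \psi|.$$

Furthermore,

$$|\delta\mathbf{r}| = |\sin\psi| \approx |\psi|$$

and this implies a bound on variation in $\mathbf{v}$ of

$$|3\psi + \phi|\,|\mathbf{v}|.$$

Moreover, if we are careful to restrict our computation appropriately, then $\psi \approx \phi$, and
thus

$$|R(\hat{r} + \delta\mathbf{r}, \theta + \Delta\theta)\mathbf{v} - R(\hat{r}, \theta)\mathbf{v}| \le |4\phi|\,|\mathbf{v}|.$$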
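The rotation error bound can be checked numerically with a small Python sketch. The helper names are assumptions introduced here, and the perturbed axis $\hat{r} + \delta\mathbf{r}$ is renormalised before use, which is one plausible reading of how a perturbed direction would be applied in practice.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a: Vec) -> float:
    return math.sqrt(dot(a, a))


def rotate(r: Vec, theta: float, v: Vec) -> Vec:
    """R(r, theta) v = cos(theta) v + (1 - cos(theta)) (r . v) r + sin(theta) (r x v)."""
    c, s = math.cos(theta), math.sin(theta)
    k = (1.0 - c) * dot(r, v)
    rxv = cross(r, v)
    return tuple(c * v[i] + k * r[i] + s * rxv[i] for i in range(3))


def rotation_error(r: Vec, theta: float, dr: Vec, dtheta: float, v: Vec) -> float:
    """Magnitude of the change in the rotated vector when (r, theta) is perturbed."""
    perturbed = tuple(r[i] + dr[i] for i in range(3))
    length = norm(perturbed)
    r2 = tuple(x / length for x in perturbed)  # renormalised axis (assumption)
    a = rotate(r, theta, v)
    b = rotate(r2, theta + dtheta, v)
    return norm(tuple(b[i] - a[i] for i in range(3)))


def rotation_error_bound(dr: Vec, dtheta: float, v: Vec) -> float:
    """The bound (2 |dr| + |dtheta|) |v| from the text."""
    return (2.0 * norm(dr) + abs(dtheta)) * norm(v)
```

Comparing `rotation_error` with `rotation_error_bound` for a few small perturbations gives a quick sanity check on the chosen thresholds before they are used in the distinguishability tests.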

Effective  bounds  on  rotation  errors 

Unfortunately,  this  is  still  a  fairly  weak  bound.  For  example,  an  error  cone  of  radius  ^ 
about  the  measured  surface  normals  would  give  rise  to  potential  errors  in  the  computed 
rotation  on  the  order  of  the  magnitude  of  the  rotated  vector.  This  is  obviated  to  a 
large  extent  by  the  fact  that  we  do  not  rely  on  a  single  measurement  in  computing  the 
transformation parameters. Rather, we use several sets of measurements, and use the
mean value as the result when computing $\hat{r}$ and $\theta$.

To see how this helps reduce the effective bound, consider the following argument.
Suppose that the error in computing $\theta$ is uniformly distributed over the range $[-2\phi, 2\phi]$.
If we take $n$ measurements and average, then the distribution of error about the correct
value $\theta_t$ should approach a normal distribution, by the Central Limit Theorem. If we
assume a uniform distribution for the error in each measurement, then the variance in
the error can be shown to equal

$$\frac{4\phi^2}{3}.$$

If there is no systematic error in the measurements, i.e. each measurement error can be
considered independent of the others, then the distribution of average error is essentially
a zero-mean normal distribution with variance

$$\frac{4\phi^2}{3n}$$

and hence with standard deviation

$$\sqrt{\frac{4\phi^2}{3n}}.$$
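For reference, the variance quoted above follows from a direct integration of the uniform density $1/(4\phi)$ on $[-2\phi, 2\phi]$, and the variance of the mean of $n$ independent measurements follows from the usual scaling:

$$\sigma^2 = \int_{-2\phi}^{2\phi} \frac{x^2}{4\phi}\,dx = \frac{1}{4\phi}\cdot\frac{2(2\phi)^3}{3} = \frac{4\phi^2}{3}, \qquad \operatorname{Var}\!\left(\frac{1}{n}\sum_{k=1}^{n} x_k\right) = \frac{1}{n^2}\cdot n\,\sigma^2 = \frac{4\phi^2}{3n}.$$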

Similarly, if the magnitude of the error vector, $\delta\mathbf{r}$, associated with the computation of
the direction of rotation, $\hat{r}$, is uniformly distributed over its possible range, and the
measurements are independent, then the distribution of error in $2|\delta\mathbf{r}|$ is given by an
identical normal distribution, since the maximum error in $|\delta\mathbf{r}|$ is essentially $\phi$. By linearly
combining the two distributions, the error in the computation of $R\mathbf{v}$ is given by a
zero-mean normal distribution with variance

$$\frac{8\phi^2}{3n}.$$

While an absolute bound on the error in computing $R\mathbf{v}$ is given by $4|\phi|\,|\mathbf{v}|$, tighter,
but less certain, bounds are possible. For example, if we impose a 0.95 probability that
the error does not exceed the bound, then an expression for this bound is given by the
normal distribution error function, and in this particular case, by

$$\frac{3.92\sqrt{2}}{\sqrt{3n}}\,|\phi|\,|\mathbf{v}|.$$

As the number of samples $n$ increases, this bound becomes increasingly tight.

Note that while we have assumed a uniform distribution of the errors in the individual
measurements, this is not a critical assumption. Since we are only seeking estimates
for  the  bounds  in  computational  error,  other  distributions  will  give  similar  results. 

In summary, given some lower bound on the number of samples to be used in computing
the transformation from model coordinates to sensor coordinates, and given that
the assumption of small errors in the measurements of surface orientation holds, the
error in the computed rotation of a vector $\mathbf{v}$ is given by a zero-mean normal distribution,
scaled by the magnitude $|\mathbf{v}|$, with standard deviation

$$\sqrt{\frac{8\phi^2}{3n}}$$

where $n$ is the number of measurement samples and $\phi$ the angle of maximum error in
the measurement of surface orientation at each sample point.


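Errors in $\mathbf{v}_0$

We know that the translation component of the transformation is given by

$$[\hat{n}_{m,i} \cdot (\hat{n}_{m,j} \times \hat{n}_{m,k})]\,\mathbf{v}_0 = (\hat{n}'_{m,i} \cdot \mathbf{p}_{s,i} - d_i)(\hat{n}'_{m,j} \times \hat{n}'_{m,k}) + (\hat{n}'_{m,j} \cdot \mathbf{p}_{s,j} - d_j)(\hat{n}'_{m,k} \times \hat{n}'_{m,i}) + (\hat{n}'_{m,k} \cdot \mathbf{p}_{s,k} - d_k)(\hat{n}'_{m,i} \times \hat{n}'_{m,j})$$

where $\hat{n}_{m,i}$ is a face normal in model coordinates, $\hat{n}'_{m,i}$ is the corresponding face normal
transformed into sensor coordinates, $\mathbf{p}_{s,i}$ is the position vector of the contact point in
sensor coordinates, and $d_i$ is the constant offset for face $i$.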

If the error in measured surface normals is given by $\epsilon_n = \cos\phi$ such that $\hat{n}_{s,i} \cdot \hat{n}'_{m,i} \ge \epsilon_n$
and the error in measured contact positions is bounded in magnitude by $\epsilon_d$, then the
error in each component


$$(\hat{n}'_{m,k} \cdot \mathbf{p}_{s,k} - d_k)(\hat{n}'_{m,i} \times \hat{n}'_{m,j})$$


is  bounded  in  magnitude  by 


where 


y  's  sin  f  -  (s  -r  A)  sin  (f  -  2<^)|*  4  {s  4  A)*  sin  (2;)  sin  (4^) 


$$s = \hat{n}'_{m,k} \cdot \mathbf{p}_{s,k} - d_k$$


A  <  CJ  V  2\  1  ~  fn 

$$\cos\xi = \hat{n}'_{m,i} \cdot \hat{n}'_{m,j}.$$

If we restrict our computation to cases in which the faces are nearly orthogonal,
then this bound on the components of the translation vector reduces to

Is  -  (s  -  A)  cos  (20), . 
Choosing the Parameters

We now consider deriving a formal definition of distinguishability as applied to surface
normals and to distances. Consider first the case of distinguishing poses on the basis of
measured surface normals. Suppose that $\alpha$ denotes the angle between surface normals
associated with two different possible poses. What is the minimum size of $\alpha$ needed to
distinguish these poses?

Clearly, the expected normals must differ by an amount that is bigger than the
sensitivity of the measurements themselves. Thus, $\alpha$ must at least exceed $2\cos^{-1}\epsilon_n = 2\phi$.
Since there is also some error associated with the computed transformations associated
with each pose, the angle must also exceed this error. By the previous analysis, if we use
a single measurement to determine the rotation matrix $R$, then this error is bounded by
$|4\phi|$ and hence, we have the bound

$$\alpha > 2\phi + 2(4\phi) = 10\phi.$$

For most values of $\phi$, this bound is far too large to be of much use.

If we use several measurements to compute $R$, however, then more effective bounds
can be used. As shown previously, assuming no systematic error implies that the error in
the computed surface normal associated with a transformed face is given by a zero-mean
normal distribution with standard deviation

$$\sqrt{\frac{8\phi^2}{3n}}$$

where $n$ is the number of measurement samples.

This gives us a tighter definition of distinguishable surface normals. In particular,
if $\alpha$ denotes the angle between surface normals associated with two distinct poses, those
poses are distinguishable if

$$\alpha > 2\phi + 2p\phi.$$

The first term denotes the range of possible error in the measurement of the surface
normals, and the second term denotes the range of possible error in the expected values
of the surface normals. Here, $p$ is a scale factor that is a function of the reliability of the
error bound. That is, $p(c)$ denotes the point in the normal distribution described above,
expressed in units of $\phi$, such that a fraction $c$ of the weight of the distribution lies
within the value $p\phi$.

For example, if the cutoff on the reliability of the bound is 0.95, and the number
of measurements involved in computing the transformation is at least 10, then $p \le 1.01$
and thus the bound on two surface normals being distinguishable is

$$\alpha > 4.02\,\phi.$$
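The numbers in this example can be reproduced with a few lines of Python. This is a sketch under the assumption that the 0.95 cutoff corresponds to 1.96 standard deviations of the zero-mean normal distribution with standard deviation $\sqrt{8\phi^2/(3n)}$; the function names are introduced here for illustration only.

```python
import math


def scale_factor(n: int, z: float = 1.96) -> float:
    """Scale factor p such that p * phi is z standard deviations of the error
    in a transformed normal, where sigma = phi * sqrt(8 / (3 n)).
    z = 1.96 is the assumed two-sided 0.95 cutoff."""
    return z * math.sqrt(8.0 / (3.0 * n))


def normals_distinguishable(alpha: float, phi: float, n: int) -> bool:
    """Test alpha > 2 phi + 2 p phi for two expected surface normals."""
    return alpha > 2.0 * phi + 2.0 * scale_factor(n) * phi
```

For n = 10 the scale factor evaluates to roughly 1.01, so the threshold is about 4.02 times the measurement error angle, compared with 10 times for a single measurement.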

We can also derive a formal definition of distinguishability based on position
measurements. We first note that if a face is defined by the pair $(\hat{n}_m, d)$ in model coordinates,
such that a point $\mathbf{v}$ lies on the plane of the face if

$$\mathbf{v} \cdot \hat{n}_m - d = 0$$

then the same face, after transformation, is defined by the pair $(\hat{n}'_m, d')$, where

$$\hat{n}'_m = R\,\hat{n}_m, \qquad d' = d + (\mathbf{v}_0 \cdot R\,\hat{n}_m).$$

We let the error associated with the computed value of $\hat{n}'_m$ be denoted by $\mathbf{w}$ such
that $\mathbf{w} \cdot \hat{n}'_m = 0$. The magnitude of $\mathbf{w}$ can be bounded above by

$$|\sin(p\phi)|,$$

by the discussion above. We also let the error in computing $\mathbf{v}_0$ be denoted by $\mathbf{u}$. We now
seek a bound on the possible errors in computing the point of intersection of a sensing
ray with an object face, due to errors in the computed transformation.

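Suppose that the sensing ray is given by $\alpha\hat{s} + \mathbf{o}$ where $\hat{s}$ is a specified unit vector, $\mathbf{o}$
is a specified offset vector orthogonal to $\hat{s}$, and $\alpha$ is the free parameter specifying position
along the sensing ray. The correct parameter of intersection of the sensing ray with the
transformed face is given by the value of $\alpha$ such that

$$(\alpha\hat{s} + \mathbf{o}) \cdot \hat{n}'_m - d' = 0$$

or

$$\alpha_t = \frac{d' - \mathbf{o} \cdot \hat{n}'_m}{\hat{s} \cdot \hat{n}'_m}.$$

On the other hand, if we include the potential error in the computed transformation,
then the point of intersection is given by

$$\alpha_c = \frac{[d' + \mathbf{u} \cdot (\hat{n}'_m + \mathbf{w}) + \mathbf{v}_0 \cdot \mathbf{w}] - \mathbf{o} \cdot (\hat{n}'_m + \mathbf{w})}{\hat{s} \cdot (\hat{n}'_m + \mathbf{w})}$$

and thus the difference is given by

$$\alpha_c - \alpha_t = \frac{\mathbf{u} \cdot \hat{n}'_m + (\mathbf{u} + \mathbf{v}_0 - \mathbf{o}) \cdot \mathbf{w}}{\hat{s} \cdot \hat{n}'_m + \hat{s} \cdot \mathbf{w}} - \alpha_t\,\frac{\hat{s} \cdot \mathbf{w}}{\hat{s} \cdot \hat{n}'_m + \hat{s} \cdot \mathbf{w}}.$$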

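As a consequence, we can bound the error $r = |\alpha_c - \alpha_t|$ in the expected intersection
point of the sensing ray with the face by

$$r \le \frac{|\mathbf{u}| + (|\mathbf{u}| + |\mathbf{v}_0 - \mathbf{o}|)\,|\mathbf{w}|}{|\hat{s} \cdot \hat{n}'_m| - |\mathbf{w}|} + \frac{|\alpha_t|\,|\mathbf{w}|}{|\hat{s} \cdot \hat{n}'_m| - |\mathbf{w}|}$$

where

$\mathbf{w}$ is the error in computing $\hat{n}'_m$
$\alpha_t$ is the predicted intersection point
$\mathbf{u}$ is the error in $\mathbf{v}_0$
$\hat{s}$ is the sensing direction
$\mathbf{o}$ is the sensing offset vector
$\hat{n}'_m$ is the face normal in transformed coordinates
$\mathbf{v}_0$ is the computed translation.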

Thus, given this bound, two poses are distinguishable if the difference between their
expected points of intersection is large enough,

$$|\alpha_i - \alpha_j| > 2\epsilon_d + r_i + r_j.$$

As in the case of distinguishing on the basis of surface normals, the bounds for $\mathbf{w}$ and $\mathbf{u}$
may be too large to be practical. We can reduce these bounds by using several
measurements to determine a value for $\mathbf{v}_0$. As in the previous case, this will lead to a zero-mean
normal distribution of expected error, and the effective range of error will be reduced.
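The distance test can be sketched in Python as follows. The function names, the use of scalar magnitude bounds `u_max` and `w_max` for the errors in the translation and in the transformed normal, and the symbol `eps_d` for the bound on position measurement error are assumptions made for this illustration.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def intersection_parameter(s: Vec, o: Vec, n: Vec, d: float) -> float:
    """Parameter alpha at which the ray alpha * s + o meets the plane x . n = d."""
    return (d - dot(o, n)) / dot(s, n)


def intersection_error_bound(s: Vec, o: Vec, n: Vec, d: float, v0: Vec,
                             u_max: float, w_max: float) -> float:
    """Bound r on the error in the predicted intersection parameter, given
    magnitude bounds on the translation error (u_max) and normal error (w_max)."""
    sn = abs(dot(s, n))
    if sn <= w_max:
        return math.inf  # ray nearly grazes the face: no useful bound
    alpha_t = intersection_parameter(s, o, n, d)
    offset = math.sqrt(sum((v0[i] - o[i]) ** 2 for i in range(3)))
    return (u_max + (u_max + offset) * w_max + abs(alpha_t) * w_max) / (sn - w_max)


def distances_distinguishable(alpha_i: float, alpha_j: float,
                              r_i: float, r_j: float, eps_d: float) -> bool:
    """Test |alpha_i - alpha_j| > 2 eps_d + r_i + r_j for two poses."""
    return abs(alpha_i - alpha_j) > 2.0 * eps_d + r_i + r_j
```

Returning infinity for grazing rays simply marks such grid segments as indistinguishable by distance, which matches the intent of avoiding sensing rays nearly parallel to a face.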

Finally, we need to place bounds on the minimum Chebychev values needed to
guarantee that perturbations in the computed transformations will not cause the sensing
ray to miss the intended face. Let $c$ denote the Chebychev value associated with a
particular face, whose transformed unit surface normal is $\hat{n}'_m$, and where the unit sensing
ray is given by $\hat{s}$. Then the modified Chebychev value in the sensing plane is given by
$c\,|\hat{n}'_m \cdot \hat{s}|$. At the same time, the variation in the position of a point on the face, as a
function of error in the computed transformation, is given by

$$\Delta\mathbf{p} = \mathbf{u} + \mathbf{q}$$

where, as above, $\mathbf{u}$ denotes the error in the computed value of $\mathbf{v}_0$ and $\mathbf{q}$ is the error in
the computed value of $R\mathbf{p}$, where $\mathbf{p}$ is the Chebychev point in model coordinates. The
magnitudes of these error vectors are bounded by the expressions derived above. Since
the directions of the error vectors are arbitrary, the condition on the Chebychev value
required to ensure contact with the face is

$$c > \frac{|\mathbf{u}| + |\mathbf{q}|}{|\hat{n}'_m \cdot \hat{s}|}.$$

Thus,  we  have  derived  conditions  on  the  parameters  of  the  disambiguation  algorithm 
needed  to  guarantee  the  performance  of  the  algorithm. 
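The Chebychev condition is simple enough to state directly in code. The sketch below assumes scalar magnitude bounds `u_max` and `q_max` on the two error vectors; the function name is hypothetical.

```python
from typing import Tuple

Vec = Tuple[float, float, float]


def chebychev_sufficient(c: float, n: Vec, s: Vec, u_max: float, q_max: float) -> bool:
    """True if a face with Chebychev value c and transformed unit normal n is
    still hit by a ray along unit direction s when the computed position of
    the Chebychev point may be off by up to u_max + q_max."""
    foreshortening = abs(n[0] * s[0] + n[1] * s[1] + n[2] * s[2])
    # c * |n . s| is the Chebychev value as seen in the sensing plane.
    return c * foreshortening > u_max + q_max
```

Faces viewed obliquely need a proportionally larger Chebychev value, because their projection onto the sensing plane is foreshortened by the factor $|\hat{n}'_m \cdot \hat{s}|$.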


Discussion  and  Examples 

When  Do  We  Compute  the  Sensing  Directions? 

We  have  described  a  technique  for  determining  optimal  sensing  directions.  We  have  still 
to  consider,  however,  how  to  interface  such  a  technique  with  the  general  problem  of 
recognition  and  localization.  The  simplest  method  is  to  obtain  some  initial  set  of  sensory 
data  points,  apply  our  recognition  technique,  and  then  use  the  disambiguation  process  as 
required,  based  on  the  current  set  of  consistent  poses.  For  example,  if  there  are  several 
consistent  poses,  we  could  choose  the  first  pair,  compute  an  optimal  sensing  direction 
based  on  that  pair  and  obtain  a  new  data  point.  Then,  we  could  determine  which  of 
the  set  of  poses  are  also  consistent  with  the  new  data  point  and  iterate.  This  technique, 
while applicable to arbitrary sets of objects, has the disadvantage of high computational
expense. 

In  situations  in  which  a  large  number  of  objects  are  possible,  we  may  not  be  able  to  do 
any better than to compute sensing points as needed, based on the current set of feasible
poses.  In  situations  involving  a  single  object,  however,  there  may  be  an  alternative 
method  for  integrating  the  computation  of  sensing  positions  with  the  interpretation  of 
the  sensory  data. 

In  particular,  given  the  analysis  developed  here,  one  can  precompute  optimal  sensing 
rays  as  a  function  of  the  difference  in  transformation  associated  with  two  poses.  Take 
any  pair  of  poses  of  an  object.  There  exists  a  rigid  transformation  taking  one  pose  into 
the  other,  which  we  can  parameterize  in  some  fashion.  We  compute  the  optimal  sensing 
direction  for  this  pair  of  poses,  and  insert  it  into  a  lookup  table,  whose  dimensions  are 
indexed  by  the  parameters  of  the  relative  transformation.  Since  the  workspace  of  the 
sensory  system  is  bounded,  this  is  a  bounded  table  (that  is,  the  translational  degrees 
of freedom are not infinite in extent). The analysis can be used to compute an optimal
sensing ray corresponding to each entry of the table, where the parameters of the
transformation  are  quantized  to  some  desired  level. 


Now, when attempting to disambiguate two possible poses, one simply computes
the difference in the transformations, looks up the precomputed sensing ray in the
appropriate slot of the table, transforms that ray by the transformation associated with
the first pose, and then senses along that ray to obtain a new data point. That data
point is added to the current set of sensory data, and the recognition and localization
process is applied. If a unique pose results, the process is stopped; if not, a new sensing
ray is obtained and the process continues.
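One way such a lookup table might be organised is sketched below in Python. Everything specific in it is an assumption made for illustration: the relative transformation is taken to be parameterized by six numbers (for instance a rotation vector followed by a translation), each quantized with its own step size, and the class and method names are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Ray = Tuple[Tuple[float, float, float], Tuple[float, float, float]]  # (direction, offset)


def quantize(params: Sequence[float], steps: Sequence[float]) -> Tuple[int, ...]:
    """Map each transformation parameter to the index of its nearest table slot."""
    return tuple(int(math.floor(p / s + 0.5)) for p, s in zip(params, steps))


class SensingRayTable:
    """Precomputed sensing rays indexed by a quantized relative transformation."""

    def __init__(self, steps: Sequence[float]) -> None:
        self.steps = tuple(steps)
        self.table: Dict[Tuple[int, ...], Ray] = {}

    def store(self, relative_transform: Sequence[float], ray: Ray) -> None:
        self.table[quantize(relative_transform, self.steps)] = ray

    def lookup(self, relative_transform: Sequence[float]) -> Optional[Ray]:
        """Return the stored ray for this slot, or None if it was not precomputed."""
        return self.table.get(quantize(relative_transform, self.steps))
```

A ray returned by `lookup` is expressed relative to the first pose, so it still has to be mapped through that pose's transformation before sensing. The quantization step sizes trade table size against how close the stored ray is to the optimal one for the actual pair of poses.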

By precomputing the sensing rays, we can avoid the computational expense associated
with finding a new sensing position, and at the same time take advantage of the
efficiency of the technique in disambiguating multiple poses.

Avoiding  False  Negatives 

We  have  seen  in  the  previous  discussion  that  the  analytic  error  bounds  on  the  computed 
transformations  for  any  pose  are  probably  too  large  to  be  practical.  We  argued  that  one 
way  to  reduce  these  bounds  was  to  use  several  measurements  in  the  computation  of  the 
transformation.  This  led  to  a  normal  distribution  of  error  in  each  of  the  components  of 
the  transformation,  and  thus,  given  a  level  of  desired  confidence  in  the  algorithm,  tighter 
bounds  on  the  parameters  were  possible.  In  this  case,  we  would  expect  that  in  general 
the  algorithm  will  succeed,  and  we  need  only  consider  alterations  to  the  algorithm  to 
deal  with  the  infrequent  case  when  the  errors  in  the  computed  transformation  do  exceed 
the  expected  thresholds.  There  are  two  situations  that  can  arise  in  this  case.  The  first 
is that the perturbation in the transform causes a surface normal to be sensed that
does not agree with any of the expected normals. This is essentially a false negative,
since  it  implies  that  the  poses  are  not  distinguishable.  The  more  damaging  case  is  a  false 
positive,  in  which  the  perturbation  in  the  transformation  results  in  a  sensor  measurement 
that  coincidentally  agrees  with  the  wrong  pose. 

The easiest solution is to use more than one sense point. In this manner, false
negatives are easily handled, since the expectation is that not all sensed points will give
inconsistent data. This will be especially true if several sensing directions are used, in
particular if the sensing directions are orthogonal. As well, it is likely that false positives
can also be detected, since the expectation is that the correct pose will be found by most
sensor points, again especially if several directions are used, and a simple voting scheme
will arrive at the correct answer.
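A simple voting scheme of the kind mentioned above might look as follows in Python. This is an illustrative sketch only: the convention that a sense point reports `None` when its measurement agrees with no expected pose, and the decision to return `None` on a tie, are assumptions made here.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional


def vote(choices: Iterable[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the pose selected by the most sense points.

    Each element of `choices` is the pose favoured by one sense point, or None
    when that point was inconsistent with every expected pose. None is returned
    when there are no usable votes or the top two poses are tied."""
    votes = Counter(c for c in choices if c is not None)
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]
```

Ignoring the `None` entries is what absorbs false negatives, while the majority count is what guards against an occasional false positive.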

Testing  the  Algorithm 

We  have  implemented  the  described  technique,  and  tested  it  on  a  number  of  examples. 
Because  the  worst  case  bounds  are  so  large,  we  used  the  approximations  described  above, 
with  the  expectation  that  on  occasion  an  incorrect  decision  would  be  made,  but  that  such 
errors  could  be  avoided  by  voting  over  several  additional  sensing  points. 

In particular, we ran the algorithm described in [Grimson and Lozano-Perez 84] for
an object in arbitrary orientation relative to the sensors and with simulated sensing from
three orthogonal directions. Whenever there was an ambiguity in interpreting the sensed
data, we used the following disambiguation technique. We used the analysis developed
above to predict a sensing ray, and for each pose we predicted ranges of expected values
for the sensory data along that ray. We then acquired an additional sense point along
the chosen sensing ray, and compared the recorded value with the expected ranges to
choose a pose.

Using  a  variety  of  simulated  sensing  errors,  the  disambiguation  technique  was  applied 
to  1000  ambiguous  cases.  It  was  found  in  336  of  these  cases  that,  due  to  the  large 
errors  inherent  in  the  sensory  data,  the  algorithm  could  not  distinguish  reliably  between 
the  possible  solutions.  In  all  of  these  cases,  the  poses  differed  by  the  reassignment  of 
one  data  point  from  one  face  to  an  adjacent  face,  and  this  resulted  in  nearly  identical 
transformations  associated  with  the  poses.  Relative  to  the  error  resolution  of  the  sensing 
devices,  these  can  be  considered  to  be  identical  solutions.  In  633  of  the  cases,  the 
disambiguation  algorithm  was  able  to  determine  the  correct  pose  with  only  a  single 
additional  sensory  point.  In  the  remaining  31  cases,  the  algorithm  chose  an  incorrect 
pose  from  the  set  of  consistent  poses. 

We also ran a second version of the disambiguation algorithm on the same set of
data. In this case, rather than using the predicted range of values to choose a pose, we
simply used the technique to generate the next sensing direction, and then ran the RAF
recognition algorithm [Grimson and Lozano-Perez 84, 85a] with that sensory point added
to the original set of sensory data. In this case, we found that the algorithm identified
the correct pose in all 664 cases, with only 1 to 3 additional sensory points required
to complete the identification.

Appendix


In  the  appendix,  we  present  a  more  detailed  error  analysis  of  the  computation  of  the 
transformation  from  model  to  sensor  coordinates. 


Errors in $\hat{r}$


We begin by considering the range of possible errors in the computation of the direction
of rotation, $\hat{r}$. By the analysis of [Grimson and Lozano-Perez 84], the rotation direction
$\hat{r}$ is computed by taking two pairs $(\hat{n}_{m,i}, \hat{n}'_{m,i})$ and $(\hat{n}_{m,j}, \hat{n}'_{m,j})$, where $\hat{n}_m$ is the
unit normal of a face of the model, and $\hat{n}'_m$ is the same unit normal rotated into sensor
coordinates, and letting $\hat{r}$ be the unit vector in the direction of

$$(\hat{n}_{m,i} - \hat{n}'_{m,i}) \times (\hat{n}_{m,j} - \hat{n}'_{m,j}).$$
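As a small illustration of this construction, the sketch below computes the unit vector along the cross product of the two normal differences. The helper names are illustrative, and the sign of the returned axis is not fixed by this construction alone.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return [x / n for x in a]

def rotation_axis(n_i, n_i_rot, n_j, n_j_rot):
    """Unit vector along (n_i - n_i') x (n_j - n_j')."""
    return unit(cross(sub(n_i, n_i_rot), sub(n_j, n_j_rot)))

# A 90 degree rotation about z maps x to y and y to -x.
axis = rotation_axis([1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                     [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0])
```

The construction works because each difference between a normal and its rotated image is perpendicular to the rotation axis, so the cross product of two such differences is parallel to the axis.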

We assume that we are given $\hat{n}_{m,i}$ and $\hat{n}'_{m,i}$, and that the sensitivity of the sensor to
errors in surface orientation is given by $\epsilon_n$. That is, if $\hat{n}'_{m,i}$ is the correct surface normal
transformed into sensor coordinates and $\hat{n}_{s,i}$ is the actual measured (or sensed) surface
normal, then

$$\hat{n}_{s,i} \cdot \hat{n}'_{m,i} \ge \epsilon_n. \tag{4}$$


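We will consider two stages in deriving bounds on the error in computing $\hat{r}$. If we
let

$$v_i = \frac{\hat{n}_{m,i} - \hat{n}'_{m,i}}{|\hat{n}_{m,i} - \hat{n}'_{m,i}|}$$

and

$$u_i = \frac{\hat{n}_{m,i} - \hat{n}_{s,i}}{|\hat{n}_{m,i} - \hat{n}_{s,i}|}$$

then the correct value for $\hat{r}$ is given by

$$\hat{r}_t = \frac{v_i \times v_j}{\sqrt{1 - (v_i \cdot v_j)^2}}$$

and the computed value is given by

$$\hat{r}_c = \frac{u_i \times u_j}{\sqrt{1 - (u_i \cdot u_j)^2}}.$$

We will first derive bounds on $v_i \cdot u_i$ and then use the result to bound $\hat{r}_t \cdot \hat{r}_c$.

The vector $\hat{n}_s$ can be represented by the following parameterization

$$\hat{n}_s = \alpha \hat{n}'_m + \beta \hat{n}_m + \delta (\hat{n}_m \times \hat{n}'_m).$$

Then equation (4) produces the inequality

$$\alpha + \beta\gamma \ge \epsilon_n \tag{4'}$$

where $\gamma = \hat{n}_m \cdot \hat{n}'_m$. We will consider the worst case, in which equality holds. Furthermore,
the fact that $\hat{n}_s$ is a unit vector yields the following constraint

$$\alpha^2 + 2\alpha\beta\gamma + \beta^2 + \delta^2 (1 - \gamma^2) = 1. \tag{5}$$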

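Given $\hat{n}_m$, $\hat{n}'_m$, and $\hat{n}_s$, we first consider the range of possible values for $\hat{n}_m - \hat{n}_s$
relative to $\hat{n}_m - \hat{n}'_m$, that is, we want bounds on the range of possible values for

$$E = v \cdot u = \frac{(\hat{n}_m - \hat{n}'_m) \cdot (\hat{n}_m - \hat{n}_s)}{|\hat{n}_m - \hat{n}'_m| \, |\hat{n}_m - \hat{n}_s|}.$$

It is straightforward to show that

$$|\hat{n}_m - \hat{n}'_m| = \sqrt{2(1 - \gamma)}.$$

Furthermore, using equation (5), one can show that

$$|\hat{n}_m - \hat{n}_s| = \sqrt{2(1 - \beta - \alpha\gamma)}.$$

Finally, expanding out the dot product and substituting yields

$$E = \frac{(1 - \beta + \alpha)(1 - \gamma)}{2\sqrt{1 - \gamma}\,\sqrt{1 - \beta - \alpha\gamma}}.$$

By equation (4'), $\alpha = \epsilon_n - \beta\gamma$, and substitution yields

$$E = \frac{\sqrt{1 - \gamma}\,[1 + \epsilon_n - \beta(1 + \gamma)]}{2\sqrt{1 - \epsilon_n\gamma - \beta(1 - \gamma^2)}}.$$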

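The first problem to consider is what is the minimum value for $E$ as $\beta$ varies. In
particular, we find that

$$\frac{\partial E}{\partial \beta} = \frac{\sqrt{1 - \gamma}\,(1 + \gamma)^2\,[\beta(1 - \gamma) - (1 - \epsilon_n)]}{4\,\left(1 - \epsilon_n\gamma - \beta(1 - \gamma^2)\right)^{3/2}}.$$

This is zero when

$$\beta = \frac{1 - \epsilon_n}{1 - \gamma} \tag{6}$$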

and this is a valid value for $\beta$ provided $\gamma < \epsilon_n$. Taking a second partial derivative of $E$,
we find that the sign of $\partial^2 E / \partial \beta^2$ is given by the sign of

$$\beta(1 - \gamma^2) + 3(\epsilon_n - \gamma) + \epsilon_n\gamma - 1.$$

Substituting equation (6), we find that the sign of the second partial derivative is given
by the sign of

$$2(\epsilon_n - \gamma)$$

and this is positive, since $\gamma < \epsilon_n$. Hence, $E$ achieves a minimum at the value of $\beta$ given
by equation (6), and this value is

$$\sqrt{\frac{\epsilon_n - \gamma}{1 - \gamma}}. \tag{7}$$
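The minimization above can be checked numerically. The sketch below assumes the substituted form $E(\beta) = \sqrt{1-\gamma}\,[1 + \epsilon_n - \beta(1+\gamma)] / (2\sqrt{1 - \epsilon_n\gamma - \beta(1-\gamma^2)})$ and compares a brute-force scan with the closed-form minimum $\sqrt{(\epsilon_n - \gamma)/(1 - \gamma)}$ at $\beta = (1-\epsilon_n)/(1-\gamma)$. The values of $\gamma$ and $\epsilon_n$ and the scan window are illustrative assumptions.

```python
import math

def E(beta: float, gamma: float, eps_n: float) -> float:
    """E as a function of beta after substituting alpha = eps_n - beta*gamma."""
    num = math.sqrt(1.0 - gamma) * (1.0 + eps_n - beta * (1.0 + gamma))
    den = 2.0 * math.sqrt(1.0 - eps_n * gamma - beta * (1.0 - gamma ** 2))
    return num / den

def E_min_closed_form(gamma: float, eps_n: float) -> float:
    return math.sqrt((eps_n - gamma) / (1.0 - gamma))

gamma, eps_n = 0.2, 0.95                      # illustrative, with gamma < eps_n
# Arbitrary scan window chosen so that the square root stays real.
betas = [-0.3 + 0.6 * k / 6000 for k in range(6001)]
beta_best = min(betas, key=lambda b: E(b, gamma, eps_n))
beta_star = (1.0 - eps_n) / (1.0 - gamma)     # stationary point, equation (6)
```

With these values the scan should land on the grid point nearest `beta_star`, and `E(beta_best, ...)` should agree with the closed form to within the grid resolution.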


If $\gamma > \epsilon_n$, then the minimum value for $E$ occurs for $\beta$ at the limit of its range,
namely $\beta = 1$. In this case, $E < 0$, and is minimized when $\gamma = \sqrt{\epsilon_n}$, taking the value

$$E = -\frac{1 - \sqrt{\epsilon_n}}{2}.$$

In general, we will try to restrict the computation of $\hat{r}$ to those cases in which $\gamma < \epsilon_n$,
in order to keep the magnitude of the possible error in $\hat{r}$ small. Note that the minimum
value for $E$ is monotonic in $\gamma$, that is, the minimum $E$ increases as $\gamma$ decreases towards
$-1$.

Thus, we obtain the bound

$$u_i \cdot v_i \ge \delta_i$$

where

$$\delta_i = \sqrt{\frac{\epsilon_n - \gamma_i}{1 - \gamma_i}}, \qquad \gamma_i = (\hat{n}_{m,i} \cdot \hat{n}'_{m,i}).$$

We are now interested in obtaining bounds on $\hat{r}_t \cdot \hat{r}_c$. In essence, we have two cones
in the Gaussian sphere, centered about $v_i$ and $v_j$, of radius $\delta_i$ and $\delta_j$ respectively. The
possible values of $\hat{r}_c$ are given by the normalized cross products of vectors within these
cones. Clearly, if the cones overlap, then the computation for $\hat{r}$ is unstable. We avoid
this case by requiring that the cones do not overlap.

Note that if all the error in the computation of either $u_i$ or $u_j$ lies in the plane
spanned by $v_i$ and $v_j$, then the normalization of the cross product will result in the
correct value $\hat{r}_t = \hat{r}_c$. Clearly the maximum deviation of $\hat{r}_c$ from $\hat{r}_t$ will occur when the
error between $u_i$ and $v_i$ and the error between $u_j$ and $v_j$ lie maximally separated from
this plane. This requires that we check two cases, one in which the errors lie on the same
side of the plane and one in which the errors lie on opposite sides of the plane. We now
consider the first case.


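Let $\eta = v_i \cdot v_j$. Then

$$u_i = \delta_i v_i + \sqrt{\frac{1 - \delta_i^2}{1 - \eta^2}}\,(v_i \times v_j), \qquad u_j = \delta_j v_j + \sqrt{\frac{1 - \delta_j^2}{1 - \eta^2}}\,(v_i \times v_j)$$

and

$$u_i \times u_j = \delta_i\delta_j\,(v_i \times v_j) + \delta_j\sqrt{\frac{1 - \delta_i^2}{1 - \eta^2}}\,[(v_i \times v_j) \times v_j] + \delta_i\sqrt{\frac{1 - \delta_j^2}{1 - \eta^2}}\,[v_i \times (v_i \times v_j)].$$

Thus

$$(v_i \times v_j) \cdot (u_i \times u_j) = \delta_i\delta_j\,[1 - \eta^2]$$

and

$$u_i \cdot u_j = \delta_i\delta_j\eta + \sqrt{1 - \delta_i^2}\,\sqrt{1 - \delta_j^2}.$$

Then, by substitution,

$$\hat{r}_t \cdot \hat{r}_c \ge \frac{\delta_i\delta_j\sqrt{1 - \eta^2}}{\sqrt{1 - \left(\delta_i\delta_j\eta + \sqrt{1 - \delta_i^2}\,\sqrt{1 - \delta_j^2}\right)^2}}. \tag{8}$$

In the second case, we change $u_j$ to

$$u_j = \delta_j v_j - \sqrt{\frac{1 - \delta_j^2}{1 - \eta^2}}\,(v_i \times v_j)$$

and thus

$$u_i \times u_j = \delta_i\delta_j\,(v_i \times v_j) + \delta_j\sqrt{\frac{1 - \delta_i^2}{1 - \eta^2}}\,[(v_i \times v_j) \times v_j] - \delta_i\sqrt{\frac{1 - \delta_j^2}{1 - \eta^2}}\,[v_i \times (v_i \times v_j)].$$

Following through the same algebra leads to a bound on the dot product of

$$\hat{r}_t \cdot \hat{r}_c \ge \frac{\delta_i\delta_j\sqrt{1 - \eta^2}}{\sqrt{1 - \left(\delta_i\delta_j\eta - \sqrt{1 - \delta_i^2}\,\sqrt{1 - \delta_j^2}\right)^2}} \tag{9}$$

where

$$\eta = v_i \cdot v_j, \qquad v_i = \frac{\hat{n}_{m,i} - \hat{n}'_{m,i}}{|\hat{n}_{m,i} - \hat{n}'_{m,i}|}, \qquad \gamma_i = \hat{n}_{m,i} \cdot \hat{n}'_{m,i}.$$

It is straightforward to show that the bound in equation (9) is in fact smaller than
the one in equation (8).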

Note that if $\gamma_i$ is close to 1, then the error bound becomes increasingly large. This is
to be expected, since in this case $\hat{n}_{m,i} \approx \hat{n}'_{m,i}$ and thus small errors in the position of
$\hat{n}_{s,i}$ can lead to large errors in the position of $\hat{r}$. Similarly, if $\eta$ is near 1, large errors can
also result. If we restrict our computation (where possible) to cases where $\gamma_i$ and $\eta$ are
small, then we have an approximate bound on the error in computing the direction of
rotation given by

$$\hat{r}_t \cdot \hat{r}_c \ge \epsilon_n.$$

This bound is supported by the results of the simulations reported in [Grimson and
Lozano-Perez 84].
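To see how the two case bounds behave, the sketch below evaluates them under the assumption that they take the form $\delta_i\delta_j\sqrt{1-\eta^2} / \sqrt{1 - (\delta_i\delta_j\eta \pm \sqrt{1-\delta_i^2}\sqrt{1-\delta_j^2})^2}$, with cone radius $\delta = \sqrt{(\epsilon_n - \gamma)/(1 - \gamma)}$. The function names and the numeric values are illustrative only.

```python
import math

def cone_radius(eps_n: float, gamma: float) -> float:
    """delta = sqrt((eps_n - gamma) / (1 - gamma)), valid for gamma < eps_n."""
    return math.sqrt((eps_n - gamma) / (1.0 - gamma))

def axis_bound(d_i: float, d_j: float, eta: float, same_side: bool) -> float:
    """Lower bound on r_t . r_c; same_side selects the first case, else the second."""
    cross_term = math.sqrt(1.0 - d_i ** 2) * math.sqrt(1.0 - d_j ** 2)
    u_dot = d_i * d_j * eta + (cross_term if same_side else -cross_term)
    return d_i * d_j * math.sqrt(1.0 - eta ** 2) / math.sqrt(1.0 - u_dot ** 2)

eps_n = 0.95
d = cone_radius(eps_n, 0.0)                          # gamma_i small
first_case = axis_bound(d, d, 0.3, same_side=True)
second_case = axis_bound(d, d, 0.3, same_side=False)
small_case = axis_bound(d, d, 0.0, same_side=False)  # gamma_i and eta both small
```

For positive $\eta$ the second case gives the smaller value, and when both $\gamma_i$ and $\eta$ are near zero the bound comes out close to $\epsilon_n$, in line with the approximation stated above.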


Errors in $\theta$


We now want to consider bounds on the possible error in computing the remaining
parameter of the rotation component of the transformation, namely, the angle of rotation
$\theta$. Given the expressions in equation (1) for $\cos\theta$ and $\sin\theta$, the value of $\theta$ is given by

$$\tan\theta = \frac{\hat{n}'_m \cdot (\hat{r} \times \hat{n}_m)}{(\hat{r} \times \hat{n}'_m) \cdot (\hat{r} \times \hat{n}_m)}$$

where $\hat{n}_m$ is the unit normal of a face in model coordinates and $\hat{n}'_m$ is the corresponding
normal transformed into sensor coordinates.
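The following sketch recovers the rotation angle from a model normal and its rotated image, assuming the numerator is $\hat{n}'_m \cdot (\hat{r} \times \hat{n}_m)$ and the denominator is $(\hat{r} \times \hat{n}'_m) \cdot (\hat{r} \times \hat{n}_m)$. Using `math.atan2` on the two parts, rather than dividing, is a choice made here to keep the quadrant; the test rotation is generated with Rodrigues' formula.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def rotate(r, theta, v):
    """Rotate v by theta about the unit axis r (Rodrigues' formula)."""
    c, s = math.cos(theta), math.sin(theta)
    rv, rxv = dot(r, v), cross(r, v)
    return [c * v[i] + (1.0 - c) * rv * r[i] + s * rxv[i] for i in range(3)]

def rotation_angle(r, n_m, n_m_rot):
    """Angle whose tangent is n'.(r x n) / ((r x n').(r x n))."""
    num = dot(n_m_rot, cross(r, n_m))
    den = dot(cross(r, n_m_rot), cross(r, n_m))
    return math.atan2(num, den)

r = [0.0, 0.0, 1.0]
n_m = [math.sqrt(0.5), 0.0, math.sqrt(0.5)]
theta = 0.7
recovered = rotation_angle(r, n_m, rotate(r, theta, n_m))
```

The normal must not be parallel to the axis, since both cross products vanish in that case and the angle is undetermined.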


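As in the previous section, we let $\hat{r}_t$ denote the true direction of rotation and $\hat{r}_c$ the
computed direction of rotation. We will assume that the error in the computed direction
of rotation is bounded by

$$\hat{r}_t \cdot \hat{r}_c \ge \delta_r$$

and that the measured value for $\hat{n}'_m$ is given by $\hat{n}_s$ such that

$$\hat{n}'_m \cdot \hat{n}_s \ge \epsilon_n.$$

We shall make use of the following four unit vectors:

$$w = \frac{\hat{r}_t \times \hat{n}_m}{|\hat{r}_t \times \hat{n}_m|}, \qquad s = \frac{\hat{r}_t \times \hat{n}'_m}{|\hat{r}_t \times \hat{n}'_m|}, \qquad u = \frac{\hat{r}_c \times \hat{n}_m}{|\hat{r}_c \times \hat{n}_m|}, \qquad t = \frac{\hat{r}_c \times \hat{n}_s}{|\hat{r}_c \times \hat{n}_s|}.$$

Given these definitions, it is straightforward to show that

$$\tan\theta_t = \frac{1}{|\hat{r}_t \times \hat{n}'_m|}\,\frac{(\hat{n}'_m \cdot w)}{(s \cdot w)}, \qquad \tan\theta_c = \frac{1}{|\hat{r}_c \times \hat{n}_s|}\,\frac{(\hat{n}_s \cdot u)}{(t \cdot u)}.$$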


Our method in obtaining bounds on the deviation between these two expressions
will be to bound $\hat{n}_s \cdot u$ as a function of $\hat{n}'_m \cdot w$, and to bound $t \cdot u$ as a function of $s \cdot w$.
Once we have bounds on these expressions, they can be combined to bound the overall
expression for $\tan\theta$.

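First, we consider the range of values for

$$u \cdot w = \frac{(\hat{r}_c \times \hat{n}_m)}{|\hat{r}_c \times \hat{n}_m|} \cdot \frac{(\hat{r}_t \times \hat{n}_m)}{|\hat{r}_t \times \hat{n}_m|}.$$

In considering the range of values for $u \cdot w$ we note that because of the normalization
of the vectors, any error in $\hat{r}_c$ lying in the plane spanned by $\hat{r}_t$ and $\hat{n}_m$ will have no effect
on the dot product. Thus, the worst case occurs when all of the error lies perpendicular
to this plane. Hence, we need only consider the cases where

$$\hat{r}_c = \alpha \hat{r}_t \pm \beta (\hat{r}_t \times \hat{n}_m),$$

where

$$\alpha \ge \delta_r, \qquad 1 = \alpha^2 + \beta^2 (1 - \cos^2\nu), \qquad \cos\nu = \hat{r}_t \cdot \hat{n}_m.$$

Now, the worst case will occur for $\alpha = \delta_r$, in which case,

$$\beta = \sqrt{\frac{1 - \delta_r^2}{1 - \cos^2\nu}}$$

so that the worst case will arise for

$$\hat{r}_c = \delta_r \hat{r}_t + \sqrt{\frac{1 - \delta_r^2}{1 - \cos^2\nu}}\,(\hat{r}_t \times \hat{n}_m).$$

In this case, the following expressions hold:

$$\hat{r}_c \times \hat{n}_m = \delta_r (\hat{r}_t \times \hat{n}_m) + \sqrt{\frac{1 - \delta_r^2}{1 - \cos^2\nu}}\,(\hat{n}_m \times (\hat{n}_m \times \hat{r}_t))$$

$$(\hat{r}_c \times \hat{n}_m) \cdot (\hat{r}_c \times \hat{n}_m) = 1 - \delta_r^2 \cos^2\nu$$

$$(\hat{r}_t \times \hat{n}_m) \cdot (\hat{r}_t \times \hat{n}_m) = 1 - \cos^2\nu$$

$$(\hat{r}_c \times \hat{n}_m) \cdot (\hat{r}_t \times \hat{n}_m) = \delta_r (1 - \cos^2\nu).$$

Thus, we have the bound

$$u \cdot w \ge \mu$$

where

$$\mu = \delta_r \sqrt{\frac{1 - \cos^2\nu}{1 - \delta_r^2 \cos^2\nu}}.$$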
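The worst-case construction can be verified directly. The sketch below builds $\hat{r}_c = \delta_r \hat{r}_t + \sqrt{(1-\delta_r^2)/(1-\cos^2\nu)}\,(\hat{r}_t \times \hat{n}_m)$ for one illustrative choice of $\hat{r}_t$, $\hat{n}_m$, and $\delta_r$ (all assumed values), and compares the resulting $u \cdot w$ with $\mu = \delta_r\sqrt{(1-\cos^2\nu)/(1-\delta_r^2\cos^2\nu)}$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return [x / n for x in a]

def mu_bound(delta_r: float, cos_nu: float) -> float:
    return delta_r * math.sqrt((1.0 - cos_nu ** 2) / (1.0 - (delta_r * cos_nu) ** 2))

nu, delta_r = 1.2, 0.98                       # illustrative values
r_t = [0.0, 0.0, 1.0]
n_m = [math.sin(nu), 0.0, math.cos(nu)]       # so that r_t . n_m = cos(nu)
k = math.sqrt((1.0 - delta_r ** 2) / (1.0 - math.cos(nu) ** 2))
r_c = [delta_r * a + k * b for a, b in zip(r_t, cross(r_t, n_m))]

u = unit(cross(r_c, n_m))
w = unit(cross(r_t, n_m))
u_dot_w = dot(u, w)
```

The constructed `r_c` is a unit vector with `dot(r_c, r_t)` equal to `delta_r`, so it sits exactly on the boundary of the allowed cone.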


At this stage, we have $\hat{n}_s \cdot \hat{n}'_m \ge \epsilon_n$ and $u \cdot w \ge \mu$. We can visualize this situation
by considering two cones in the Gaussian sphere, one centered about $\hat{n}'_m$ with radius $\epsilon_n$,
and one centered about $w$ with radius $\mu$. We are essentially asking for bounds on the
range of dot products between vectors lying within these two different cones. Assuming
that the cones do not overlap, the maximum and minimum dot products will occur for
the minimum and maximum angles between elements of the cones, respectively, and this
clearly occurs for vectors lying in the cones and lying in the plane spanned by $\hat{n}'_m$ and $w$.

Suppose we denote:

$$\cos\alpha = (\hat{n}'_m \cdot w), \qquad \cos\phi = \epsilon_n, \qquad \cos\psi = \delta_r, \qquad \cos\xi = \mu.$$

Clearly, the extremal angles for these two cones are given by

$$\alpha \pm [\phi + \xi].$$

Thus, the range of possible values for

$$\hat{n}_s \cdot u$$

is bounded by

$$\cos(\alpha \pm [\phi + \xi]).$$

An analogous argument can be made for the dot product $t \cdot u$. If we let

$$\cos\beta = (s \cdot w)$$

$$\cos\zeta = \rho$$

$$\cos\omega = \delta_r \epsilon_n \cos\nu - \sqrt{1 - \delta_r^2}\,\sqrt{1 - \epsilon_n^2} = \cos(\phi + \psi) - (1 - \cos\nu)\cos\phi\cos\psi = (\hat{r}_c \cdot \hat{n}_s)$$

where $\rho$ is the bound

$$\rho = \delta_r \epsilon_n \sqrt{\frac{1 - \cos^2\nu}{1 - \cos^2\omega}}$$

then the range of possible values for

$$t \cdot u$$

is bounded by

$$\cos(\beta \pm [\xi + \zeta]).$$

Finally,

$$|\hat{r}_t \times \hat{n}'_m| = \sin\nu, \qquad |\hat{r}_c \times \hat{n}_s| = \sin\omega.$$

Thus, by gathering all these expressions together, we obtain the following worst case
expressions:

$$\tan\theta_t = \frac{1}{\sin\nu}\,\frac{\cos\alpha}{\cos\beta}$$

$$\tan\theta_c = \frac{1}{\sin\omega}\,\frac{\cos(\alpha - (\phi + \xi))}{\cos(\beta + (\xi + \zeta))}.$$

We note that this expression for the computed value of $\tan\theta$ has the expected
limiting case behavior. In particular, if the error parameters $\phi = \psi = 0$, then the entire
expression for $\tan\theta_c$ reduces to that for $\tan\theta_t$.

We now seek an estimate for the error in the computed value of $\theta$, in the special
case of $\phi$ and $\psi$ both small. In particular, we would like an expression for $\Delta\theta$ such that

$$\tan(\theta_t + \Delta\theta) \approx \tan(\theta_c).$$

In this way, we can place a bound on the possible error in the computation of $\theta$.

In the limiting case of $\phi$ and $\psi$ small, the bound $\mu \approx \delta_r$ so that $\xi \approx \psi$. Furthermore,
$\cos\omega \approx \cos\nu$ so that $\cos\zeta \approx \cos\phi\cos\psi \approx \cos(\phi + \psi)$ and hence, $\zeta \approx \phi + \psi$. As a
consequence, finding an approximation for the deviation in $\tan\theta$ reduces to comparing
the worst case of deviation between

$$\tan\theta_c = \frac{\cos(\alpha - [\phi + \psi])}{\cos(\beta + [\phi + 2\psi])}$$

and

$$\tan\theta_t = \frac{\cos\alpha}{\cos\beta}. \tag{10}$$

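By expansion,

$$\tan(\theta_t + \Delta\theta) = \frac{\tan\theta_t + \tan\Delta\theta}{1 - \tan\theta_t \tan\Delta\theta}$$

and if we substitute $\Delta\theta \approx \phi + \psi$, and use equation (10), then this expression can be
expanded into

$$\frac{\cos(\alpha - \phi - \psi) + [\cos\beta - \sin\alpha]\,\sin(\phi + \psi)}{\cos(\beta + \phi + \psi) + [\sin\beta - \cos\alpha]\,\sin(\phi + \psi)}. \tag{11}$$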
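The expansion can be confirmed numerically. The sketch below assumes $\tan\theta_t = \cos\alpha / \cos\beta$ and writes `d` for the combined small angle $\phi + \psi$; it compares the tangent addition formula with the expanded quotient, and also evaluates the quotient with the bracketed terms dropped. The angle values are arbitrary illustrations.

```python
import math

def tan_sum(alpha: float, beta: float, d: float) -> float:
    """tan(theta_t + d) from the addition formula, with tan(theta_t) = cos(alpha)/cos(beta)."""
    t = math.cos(alpha) / math.cos(beta)
    return (t + math.tan(d)) / (1.0 - t * math.tan(d))

def expanded(alpha: float, beta: float, d: float) -> float:
    """The same quantity written as the expanded quotient."""
    num = math.cos(alpha - d) + (math.cos(beta) - math.sin(alpha)) * math.sin(d)
    den = math.cos(beta + d) + (math.sin(beta) - math.cos(alpha)) * math.sin(d)
    return num / den

def truncated(alpha: float, beta: float, d: float) -> float:
    """The quotient with the bracketed second terms dropped."""
    return math.cos(alpha - d) / math.cos(beta + d)

alpha, beta, d = 0.4, 0.9, 0.03
exact = tan_sum(alpha, beta, d)
# When cos(beta) equals sin(alpha) the dropped terms vanish exactly.
beta_special = math.pi / 2 - alpha
```

The two full expressions agree for any angles where the denominators are nonzero, while the truncated form is exact only in the special case and approximate otherwise.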

Now, if $\cos\beta \approx \sin\alpha$, then the second term in both the numerator and denominator
can be ignored, especially since $\sin(\phi + \psi)$ is also small. Requiring this to be true is
equivalent to requiring that

$$(s \cdot w)^2 + (\hat{n}'_m \cdot w)^2 \approx 1$$

that is, that the component of the unit vector $w$ in the direction of $\hat{n}'_m \times s$ be small. It
is straightforward to show that

$$(w \cdot (\hat{n}'_m \times s))^2 = [\cot\nu\,(\hat{n}_m \cdot s)]^2.$$

Since we have already indicated that we will restrict our computation of the transformation
parameters to those cases in which $\hat{r} \cdot \hat{n}_m \ll 1$, it follows that $\cot\nu$ is small
and the second terms in both the numerator and denominator in equation (11) can be
disregarded.

By dropping these terms, we see that the remaining expression reduces to

$$\tan(\theta_t + \phi + \psi) \approx \frac{\cos(\alpha - [\phi + \psi])}{\cos(\beta + [\phi + \psi])}.$$

Thus, if $\psi$ is small enough, it follows that the worst case deviation is given by
$\theta_c \approx \theta_t + (\phi + \psi)$ and hence that a good approximation to the error, $\Delta\theta$, in the computed
value of the rotation, $\theta$, is given by

$$\Delta\theta \approx |\phi + \psi|.$$


Errors  in  Rv 

We have computed expressions for the possible error in $\hat{r}$ and $\theta$. In particular, we will
denote the error in $\theta$ by $\Delta\theta$ and the vector error in $\hat{r}$ by $\delta r$ such that $\hat{r} \cdot \delta r = 0$. We now
consider the problem of estimating bounds on the possible error in applying the computed
rotation matrix to an arbitrary vector $v$. We know that the rotational component of the
transformation of $v$ is given by

$$R(\hat{r}, \theta)v = \cos\theta\, v + (1 - \cos\theta)(\hat{r} \cdot v)\hat{r} + \sin\theta\,(\hat{r} \times v)$$

where $\hat{r}$ and $\theta$ are the parameters determining the rotation.

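We first consider the variation of this expression with respect to the angle of rotation.
In particular, under the assumption that $\Delta\theta$ is small, the following holds, where $\hat{v} = v / |v|$:

$$R(\hat{r}, \theta + \Delta\theta)v - R(\hat{r}, \theta)v = |v| \{ (\cos(\theta + \Delta\theta) - \cos\theta)\,\hat{v} + (\cos\theta - \cos(\theta + \Delta\theta))(\hat{r} \cdot \hat{v})\,\hat{r} + (\sin(\theta + \Delta\theta) - \sin\theta)(\hat{r} \times \hat{v}) \}$$

$$\approx |v| \{ -\Delta\theta \sin\theta\,\hat{v} + \Delta\theta \sin\theta\,(\hat{r} \cdot \hat{v})\,\hat{r} + \Delta\theta \cos\theta\,(\hat{r} \times \hat{v}) \}.$$

Straightforward algebraic manipulation shows that the magnitude of this term is given
by

$$|v|\,|\Delta\theta|\,\sqrt{1 - (\hat{r} \cdot \hat{v})^2}$$

and this is bounded above by $|\Delta\theta|\,|v|$.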

Next, we consider the variation with respect to $\hat{r}$, so that, with $\hat{v} = v / |v|$,

$$R(\hat{r} + \delta r, \theta)v - R(\hat{r}, \theta)v = |v| \{ (1 - \cos\theta)\,[(\hat{r} \cdot \hat{v})\,\delta r + (\delta r \cdot \hat{v})\,[\hat{r} + \delta r]] + \sin\theta\,[\delta r \times \hat{v}] \}.$$

We consider the magnitude of the second term in the right hand side of this expression,
by taking the dot product of this vector with itself. If we ignore terms in $(\delta r \cdot \delta r)$, since
the assumption of $\delta r$ small implies such terms are negligible, then the magnitude of the
second term in equation (2) is given by

$$|(1 - \cos\theta)(\delta r \cdot \hat{v}) + \sin\theta\,(\hat{v} \cdot (\hat{r} \times \delta r))|. \tag{3}$$

We now consider a bound for this expression. Suppose we let $k$ denote the unit vector
in the direction of $\delta r$, and let $(k \cdot \hat{v}) = \cos\xi$. Since the worst case will occur when $v$
lies entirely in the plane spanned by $\delta r$ and $\hat{r} \times \delta r$, equation (3) reduces to

$$|\delta r|\,|(1 - \cos\theta)\cos\xi + \sin\theta \sin\xi| = |\delta r|\,|\cos\xi - \cos(\theta + \xi)|.$$

It is clear that the worst possible value for this expression is $2|\delta r|$. Thus, the maximum
value for the magnitude of the second term in equation (2) is $2|\delta r|$ and overall, the
maximum deviation due to a variation in $\hat{r}$ is given by

$$2\,|v|\,|\delta r|.$$

Finally, we can piece together these two variations. By ignoring higher order terms,
it is clear that a Taylor series expansion of $R(\hat{r}, \theta)$ yields the following bound on errors
in the computed value of a rotation:

$$|R(\hat{r} + \delta r, \theta + \Delta\theta)v - R(\hat{r}, \theta)v| \le (2|\delta r| + |\Delta\theta|)\,|v|.$$

Now, if the errors $\phi$ and $\psi$ are small, then we know that

$$|\Delta\theta| \le |\phi + \psi|.$$

Furthermore,

$$|\delta r| = |\sin\psi| \approx |\psi|$$

and this implies a bound on variation in $v$ of

$$|3\psi + \phi|\,|v|.$$

Moreover, if we are careful to restrict our computation appropriately, then $\psi \approx \phi$, and
thus

$$|R(\hat{r} + \delta r, \theta + \Delta\theta)v - R(\hat{r}, \theta)v| \le |4\phi|\,|v|.$$
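A quick randomized check of the combined bound $(2|\delta r| + |\Delta\theta|)\,|v|$ is sketched below. The sampling ranges (perturbations of at most 0.01 and rotation angles in $[0, \pi/2]$), the seed, and the helper names are assumptions of this sketch. As in the derivation, the perturbed axis $\hat{r} + \delta r$ is used without renormalization.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return [x / n for x in a]

def rotate(r, theta, v):
    """cos(t) v + (1 - cos(t)) (r.v) r + sin(t) (r x v), applied literally."""
    c, s = math.cos(theta), math.sin(theta)
    rv, rxv = dot(r, v), cross(r, v)
    return [c * v[i] + (1.0 - c) * rv * r[i] + s * rxv[i] for i in range(3)]

def worst_ratio(trials: int = 2000, seed: int = 0) -> float:
    """Largest observed ratio of actual deviation to the bound."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        r = unit([rng.gauss(0, 1) for _ in range(3)])
        v = [rng.gauss(0, 1) for _ in range(3)]
        g = [rng.gauss(0, 1) for _ in range(3)]
        perp = unit([g[i] - dot(g, r) * r[i] for i in range(3)])  # orthogonal to r
        mag = rng.uniform(0.001, 0.01)
        dr = [mag * x for x in perp]
        theta = rng.uniform(0.0, math.pi / 2)
        dtheta = rng.uniform(-0.01, 0.01)
        a = rotate([r[i] + dr[i] for i in range(3)], theta + dtheta, v)
        b = rotate(r, theta, v)
        err = norm([a[i] - b[i] for i in range(3)])
        bound = (2.0 * norm(dr) + abs(dtheta)) * norm(v)
        worst = max(worst, err / bound)
    return worst

ratio = worst_ratio()
```

A ratio below 1 over all samples is consistent with the bound; it does not prove the bound, and angles near $\pi$ are deliberately excluded here because the first-order estimate is tightest there.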


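Errors in $v_0$

We now consider bounds on the error associated with computing the translation component
$v_0$ of the transformation. Recall that the correct form for $v_0$ is given by

$$[\hat{n}'_{m,i}\,\hat{n}'_{m,j}\,\hat{n}'_{m,k}]\,v_0 = (\hat{n}'_{m,i} \cdot p_{s,i} - d_i)(\hat{n}'_{m,j} \times \hat{n}'_{m,k}) + (\hat{n}'_{m,j} \cdot p_{s,j} - d_j)(\hat{n}'_{m,k} \times \hat{n}'_{m,i}) + (\hat{n}'_{m,k} \cdot p_{s,k} - d_k)(\hat{n}'_{m,i} \times \hat{n}'_{m,j})$$

where $\hat{n}_{m,i}$ is a face normal in model coordinates, $\hat{n}'_{m,i}$ is the transformed normal in
sensor coordinates, $p_{s,i}$ is the position vector of the contact point in sensor coordinates,
and $d_i$ is the constant offset for face $i$. We will consider error ranges for each of the
components

$$(\hat{n}'_{m,k} \cdot p_{s,k} - d_k)(\hat{n}'_{m,i} \times \hat{n}'_{m,j})$$

separately.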

We let $s = \hat{n}'_{m,k} \cdot p_{s,k} - d_k$ and $v = \hat{n}'_{m,i} \times \hat{n}'_{m,j}$ so that the correct component is
simply $s\,v$ and the computed component is

$$(s + \Delta)(\xi v + \eta u)$$

where $u$ is a unit vector orthogonal to $v$, and $\Delta$, $\xi$ and $\eta$ are values to be determined.
We assume that the measured position vector is given by $p_s + \delta p_s$, where $\delta p_s$ is a vector
of magnitude $\epsilon_d$, and the measured normal is given by $\hat{n}_{s,i}$ such that $\hat{n}_{s,i} \cdot \hat{n}'_{m,i} \ge \epsilon_n$.

First note that the magnitude of the error in computing the component of the
translation is given by

$$|s\,v - (s + \Delta)(\xi v + \eta u)| = \sqrt{[s(1 - \xi) - \Delta\xi]^2 (v \cdot v) + \eta^2 (s + \Delta)^2}. \tag{12}$$

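Thus, we need to find bounds for $s$, $\Delta$, $(v \cdot v)$, $\xi$ and $\eta^2$. We know that $s$ is a given scalar
value. If the angle between the face normals is given by $\hat{n}'_{m,i} \cdot \hat{n}'_{m,j} = \cos\zeta$, then

$$v \cdot v = 1 - \cos^2\zeta = \sin^2\zeta.$$

It is straightforward to show that

$$\Delta = |(\hat{n}_s \cdot (p + \delta p) - d) - (\hat{n}'_m \cdot p - d)| \le |\hat{n}_s \cdot \delta p| + |(\hat{n}_s - \hat{n}'_m) \cdot p| \le \epsilon_d + |p|\,\sqrt{2}\,\sqrt{1 - \epsilon_n}.$$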

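Next, we consider bounds for $\xi$, $\eta^2$, where

$$\hat{n}_{s,i} \times \hat{n}_{s,j} = \xi\,(\hat{n}'_{m,i} \times \hat{n}'_{m,j}) + \eta\, b \tag{13}$$

for some unit vector $b$ orthogonal to $\hat{n}'_{m,i} \times \hat{n}'_{m,j}$. Then

$$(\hat{n}_{s,i} \times \hat{n}_{s,j}) \cdot (\hat{n}'_{m,i} \times \hat{n}'_{m,j}) = (\hat{n}_{s,i} \cdot \hat{n}'_{m,i})(\hat{n}_{s,j} \cdot \hat{n}'_{m,j}) - (\hat{n}_{s,i} \cdot \hat{n}'_{m,j})(\hat{n}_{s,j} \cdot \hat{n}'_{m,i})$$

$$\ge \epsilon_n^2 - (\hat{n}_{s,i} \cdot \hat{n}'_{m,j})(\hat{n}_{s,j} \cdot \hat{n}'_{m,i}).$$

Moreover, the worst case (i.e. largest value) for $\hat{n}_{s,i} \cdot \hat{n}'_{m,j}$ occurs at $\cos(\zeta - \cos^{-1}\epsilon_n)$
since this is the smallest angle possible between the cone of radius $\epsilon_n$ about $\hat{n}'_{m,i}$ and
the vector $\hat{n}'_{m,j}$. (Note that we have assumed that the two cones do not overlap, i.e.
$\zeta > 2\cos^{-1}\epsilon_n$.) As before, we let $\epsilon_n = \cos\phi$. Then, by substitution and expansion, we
get

$$(\hat{n}_{s,i} \times \hat{n}_{s,j}) \cdot (\hat{n}'_{m,i} \times \hat{n}'_{m,j}) \ge \sin\zeta\,\sin(\zeta - 2\phi).$$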

At the same time, from equation (13)

$$(\hat{n}_{s,i} \times \hat{n}_{s,j}) \cdot (\hat{n}'_{m,i} \times \hat{n}'_{m,j}) = \xi \sin^2\zeta$$

so that we have the bound

$$\xi \ge \frac{\sin(\zeta - 2\phi)}{\sin\zeta}.$$

Now the squared length of $(\hat{n}_{s,i} \times \hat{n}_{s,j})$ is given by

$$1 - (\hat{n}_{s,i} \cdot \hat{n}_{s,j})^2$$

and to get a bound on $\eta^2$, we want to maximize this expression. As before, the worst
case occurs when the $\hat{n}_s$ vectors lie at the limits of their respective cones, and

$$(\hat{n}_{s,i} \cdot \hat{n}_{s,j}) = \cos(\zeta + 2\phi).$$

We also have, however, from equation (13),

$$(\hat{n}_{s,i} \times \hat{n}_{s,j}) \cdot (\hat{n}_{s,i} \times \hat{n}_{s,j}) = \xi^2 \sin^2\zeta + \eta^2 \le 1 - \cos^2(\zeta + 2\phi).$$

Substitution and expansion yield the following bound

$$\eta^2 \le \sin(4\phi)\,\sin(2\zeta).$$

We are now ready to bound the error in computing each component of the translation
vector $v_0$. From equation (12), the magnitude of the error is given by

$$\sqrt{[s(1 - \xi) - \Delta\xi]^2 (v \cdot v) + \eta^2 (s + \Delta)^2}.$$

Substitution of the various bounds yields

$$\sqrt{[s \sin\zeta - (s + \Delta)\sin(\zeta - 2\phi)]^2 + (s + \Delta)^2 \sin(2\zeta)\,\sin(4\phi)}$$

where

$$s = \hat{n}'_{m,k} \cdot p_{s,k} - d_k$$

$$\Delta \le \epsilon_d + |p|\,\sqrt{2}\,\sqrt{1 - \epsilon_n}$$

$$\cos\zeta = \hat{n}'_{m,i} \cdot \hat{n}'_{m,j}.$$

Note that as $\phi \to 0$, this bound reduces to $|\Delta \sin\zeta|$. Furthermore, as $\epsilon_d \to 0$, this
expression tends to 0, so that the error in the computed translation vanishes as the errors
in the measurements do.

Typically, we will want to restrict our computations to cases in which the faces
are roughly orthogonal, so that $\zeta \approx \pi/2$. In this case, the bound reduces to the simple
expression

$$|s - (s + \Delta)\cos(2\phi)|.$$
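The final bound and its two limiting cases can be evaluated with a few lines of code. This sketch assumes the bound has the form $\sqrt{[s\sin\zeta - (s+\Delta)\sin(\zeta - 2\phi)]^2 + (s+\Delta)^2\sin(2\zeta)\sin(4\phi)}$ with $\Delta \le \epsilon_d + |p|\sqrt{2}\sqrt{1-\epsilon_n}$; the function names and sample numbers are illustrative, and the face angle is assumed to lie in $(0, \pi/2]$ so that the radicand stays nonnegative.

```python
import math

def delta_bound(eps_d: float, p_norm: float, eps_n: float) -> float:
    """Upper bound on Delta: eps_d + |p| * sqrt(2) * sqrt(1 - eps_n)."""
    return eps_d + p_norm * math.sqrt(2.0) * math.sqrt(1.0 - eps_n)

def translation_error_bound(s: float, delta: float, zeta: float, phi: float) -> float:
    """Bound on the error in one component of the translation.

    zeta is the angle between the two face normals (assumed in (0, pi/2]),
    phi is the angular sensing error with cos(phi) = eps_n.
    """
    a = s * math.sin(zeta) - (s + delta) * math.sin(zeta - 2.0 * phi)
    b = (s + delta) ** 2 * math.sin(2.0 * zeta) * math.sin(4.0 * phi)
    return math.sqrt(a * a + b)

# Limiting case 1: no angular error, the bound is |Delta * sin(zeta)|.
no_angle_error = translation_error_bound(2.0, 0.1, 1.0, 0.0)
# Limiting case 2: orthogonal faces, the bound is |s - (s + Delta) * cos(2 phi)|.
orthogonal = translation_error_bound(2.0, 0.1, math.pi / 2, 0.05)
```

With perfect measurements (`eps_d = 0`, `eps_n = 1`) `delta_bound` returns 0, and with `phi = 0` the error bound then vanishes as well.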


